Last updated on 8/19/24

Analyze Data Contained in Files With GPT-4

ChatGPT doesn’t just work with text and images. It can also work with entire documents!

In this chapter, we’ll explore some of ChatGPT’s more advanced capabilities. They rely on a feature called “Code Interpreter,” which lets ChatGPT write programs and run them by itself!

Read and Summarize a .pdf File

Start by uploading a .pdf file to ChatGPT by clicking the small attachment icon to the left of the text box (the same icon used for uploading images).

File upload icon. You can upload several different file types.

If you don’t have a file on hand, you can practice with the OpenClassrooms Mission Report.

Next, ask it a question about the .pdf file!

ChatGPT reads the document you uploaded.

Here are a few ideas to help you...

Summarize

Perhaps the most common request is a summary of the file:

🧔🏻‍♂️: [File] Briefly summarize this .pdf file

🤖: The OpenClassrooms 2023 Mission Report details the organization’s continued commitment to making education accessible to everyone. By focusing on improving the way they assess their impact, OpenClassrooms has achieved significant progress by sharing the positive career outcomes of its students. A large proportion of students come from underserved groups, demonstrating OpenClassrooms’ commitment to inclusion.

[...]

Getting a summary in just a few seconds is a real time saver! 😀

Identify the Structure of the Document

Personally, I like to understand the structure of a document. Ask it for the document structure (titles and subtitles):

🧔🏻‍♂️: [File] Give me the document’s detailed table of contents

🤖:

The table of contents of the OpenClassrooms Mission Report 2023 is divided into several major sections, detailing various aspects of the organization and its impact:

  1. **Quantitative analysis**:
    - Free courses: Discussion of the use and impact of free courses, often referred to as MOOCs.
    - Paid training programs: Analysis of students enrolled in paid training programs.
    - Career outcomes: Explanation of the criteria OpenClassrooms uses for recording career outcomes.

  2. **Qualitative analysis**:
    - Who are our underserved students?: Definition of the groups targeted by OpenClassrooms.
    - Success rates by category: Comparison of career outcome rates between the various student categories.

    [...]

Identify the Key Statistics

Need the raw data to get straight to the point? Then get creative and request what you want to extract. For example, if you only want the main indicators contained in the document:

🧔🏻‍♂️: [File] What are the main indicators presented and what are their values?

🤖:

The OpenClassrooms Mission Report 2023 presents various key indicators, including the following along with their values:

### Free courses:
- **Active users (monthly average)**: 287,000
- **Courses taken throughout the year**: 2,203,949
- **Countries with free users**: 183
- **Emissions per user (free and paid courses)**: 0.083 kg CO2eq

### Paid training paths:
- **People applying to a training path in 2022**: 37,218
- **Percentage of students who are women**: 31.2%
- **People accepted to a training path in 2022**: 16,940
- **Students starting a training path in 2022**: 10,915

[...]

Analyze Data

Let’s look beyond .pdf files. What about analyzing a dataset? 🤓

You can upload files to ChatGPT in many formats, including:

  • .csv

  • .xls

  • .xlsx

  • .json

  • .xml

  • etc.

ChatGPT will write the code (in Python) to open the file, analyze it, and so on. You can ask it to do a lot of things!

For this example, I’ll use the financial data file of a fictitious company provided by Microsoft. You can also download the file if you want to try it out for yourself.

Request a Basic Analysis

Start by asking for a basic analysis of the file. ChatGPT will provide you with the most important information:

🧔🏻‍♂️: [File] Perform a basic analysis of this file

🤖:

Here is a basic analysis of the “Financial Sample” dataset:

### General Information:
- The dataset contains 700 entries and 16 columns.
- Data types include objects (strings), floats, integers, and dates.

### Descriptive Statistics (for numeric columns):
- **Units Sold:** Ranges from 200 to 4492.5, with an average of 1608.3 units.
- **Manufacturing Prices:** These vary widely from 3 to 260, indicating a broad range of product costs.
- **Sales Price:** Ranges from 7 to 350, reflecting a diverse product range.
- **Gross Sales:** Ranges from 1,799 to 1,207,500, with an average of 182,759.

[...]
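
Behind the scenes, ChatGPT writes and runs Python code to produce this kind of report. The sketch below is not the code ChatGPT actually generated (it often relies on data libraries such as pandas). It is a minimal illustration, using only Python’s standard library, of what a “basic analysis” boils down to: counting rows and columns, then computing the minimum, maximum, and mean of each numeric column. The `SAMPLE_CSV` rows and the `basic_analysis` function are hypothetical and only stand in for the real file.

```python
import csv
import io
import statistics

# Hypothetical sample rows for illustration; the real file has 700 rows and 16 columns.
SAMPLE_CSV = """Segment,Country,Units Sold,Sale Price,Profit
Government,Canada,1618.5,20,16185.0
Midmarket,France,2178,15,10890.0
Enterprise,Germany,888,125,-2775.0
"""


def basic_analysis(csv_text):
    """Return row/column counts plus min, max, and mean for each numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    stats = {}
    for col in columns:
        try:
            values = [float(row[col]) for row in rows]
        except ValueError:
            continue  # not a numeric column (e.g. Segment, Country)
        stats[col] = {
            "min": min(values),
            "max": max(values),
            "mean": statistics.mean(values),
        }
    return {"rows": len(rows), "columns": len(columns), "stats": stats}


report = basic_analysis(SAMPLE_CSV)
print(f"{report['rows']} entries, {report['columns']} columns")
for name, s in report["stats"].items():
    print(f"{name}: ranges from {s['min']} to {s['max']}, average {s['mean']:.1f}")
```

Text columns are skipped automatically because they can’t be converted to numbers, which mirrors the “for numeric columns” note in ChatGPT’s answer.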

Request Possible Analysis Ideas

From here, you can ask ChatGPT more specific questions about the data, such as:

🧔🏻‍♂️: [File] Which market segment is the most profitable? The least profitable?

🤖:

The most profitable market segment is **Government** with a total profit of 11,388,173.17. The least profitable segment is **Enterprise** with a total profit of -614,545.625, indicating a loss in this segment.
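
To answer a question like this, the generated code has to group the rows by segment, add up the profit in each group, and compare the totals. Here is a minimal sketch of that logic, assuming the data has already been reduced to (segment, profit) pairs. The `records` values and the `profit_by_segment` function are hypothetical, so the totals below are not those of the real file.

```python
from collections import defaultdict

# Hypothetical (segment, profit) pairs for illustration only.
records = [
    ("Government", 16185.0),
    ("Enterprise", -2775.0),
    ("Midmarket", 10890.0),
    ("Government", 13320.0),
    ("Enterprise", -1250.0),
]


def profit_by_segment(rows):
    """Sum the profit of every row, grouped by segment name."""
    totals = defaultdict(float)
    for segment, profit in rows:
        totals[segment] += profit
    return dict(totals)


totals = profit_by_segment(records)
most_profitable = max(totals, key=totals.get)
least_profitable = min(totals, key=totals.get)

print(f"Most profitable: {most_profitable} ({totals[most_profitable]:,.2f})")
print(f"Least profitable: {least_profitable} ({totals[least_profitable]:,.2f})")
```

A negative total, as in the last line printed here, means the segment lost money overall.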

Play around with it because it can save you a lot of time!

Generate Graphs

ChatGPT can also generate graphs. Let’s see what happens...

Request Visualization Ideas

If you already know which chart you want, ask for it directly.
Otherwise, if you need some inspiration, ask ChatGPT for a few ideas:

🧔🏻‍♂️: [File] Give me 10 ideas of trends and visualizations we could make with this data

🤖:

  1. **Sales Over Time**: Create a line graph to visualize how sales have changed per month and per year. This could help to identify seasonal trends or overall company growth or decline.

  2. **Profit by Product**: A bar chart showing total profit for each product. This would highlight which products are the most and least profitable, which could influence future sales strategies or product development.

  3. **Distribution of Sales by Country**: Use a pie chart or choropleth map to show the distribution of sales in different countries. This would provide insight into which markets are the most lucrative.

[...]

Start the Visualization Creation

Now that we have visualization ideas, we can ask ChatGPT to create them for us!

🧔🏻‍♂️: [File] Generate the first two suggested visualizations

🤖:

 

ChatGPT generates the visualizations directly in the chat!
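
Before drawing a chart such as “Sales Over Time,” the code first has to aggregate the data into one value per month. The sketch below shows only that preparation step, using the standard library; the drawing itself (which ChatGPT would typically hand to a plotting library) is left out. The `sales` rows and the `monthly_totals` function are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (date, sales amount) rows for illustration only.
sales = [
    (date(2014, 1, 1), 32370.0),
    (date(2014, 1, 1), 26420.0),
    (date(2014, 6, 1), 37050.0),
    (date(2013, 12, 1), 13320.0),
]


def monthly_totals(rows):
    """Return [(YYYY-MM label, total sales)] sorted chronologically."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    return [
        (f"{year}-{month:02d}", total)
        for (year, month), total in sorted(totals.items())
    ]


points = monthly_totals(sales)
for label, total in points:
    print(label, total)
```

The labels become the x-axis of the line graph and the totals become the y-axis.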

ChatGPT can generate various visualizations, such as:

  • Line graph

  • Histogram

  • Diagram

  • Pie chart

  • etc.

Generate Geospatial Visualizations

You can do much more with the “Code Interpreter” feature than what I demonstrated in this chapter. For example, it can generate geospatial visualizations without even having to upload a file:

🧔🏻‍♂️: Display a map of Europe with large or small markers on each country, depending on the latest population figures for the countries from your last update.

🤖: I created a map of Europe with markers on each selected country, where the size of the marker is proportional to the country’s population. Click the following link to view the map: View the European population map.

You can then download the HTML file that ChatGPT generated for you. Open it and admire the result!

A map generated by ChatGPT

Compress Images

You can ask ChatGPT to resize images and even compress them to reduce their file size!

If you have a file that’s too big, you can upload it and ask ChatGPT, for example, to make it 800 px wide while keeping the same proportions and applying more compression to reduce the file size.

In my case, I’ll just upload the file and ask it to optimize the size:

🧔🏻‍♂️: [Image] Compress this image to optimize the file size

🤖:

The dimensions of the image remain the same after optimization, at 2048×1536 pixels, but the file should now be smaller in size due to the compression.

You can download the optimized image using the following link:

Optimized image

ChatGPT compressed the uploaded image

My image went from 3.2 MB to 730 KB, with no visible loss of image quality!
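
To put the numbers in this section in perspective, here is a small sketch of the arithmetic involved. It doesn’t compress anything; it only computes the height an image needs to keep its proportions at a new width, and the percentage saved on file size. The function names are hypothetical, and the calculation assumes 1 MB = 1,000 KB.

```python
def resize_keep_ratio(width, height, target_width):
    """Return (new_width, new_height) with the original proportions preserved."""
    scale = target_width / width
    return target_width, round(height * scale)


def size_reduction_percent(before_kb, after_kb):
    """Return the file-size saving as a percentage, rounded to one decimal."""
    return round((1 - after_kb / before_kb) * 100, 1)


# A 2048×1536 image resized to 800 px wide keeps its 4:3 ratio.
new_size = resize_keep_ratio(2048, 1536, 800)

# 3.2 MB is taken as 3,200 KB (assumption: decimal units).
saving = size_reduction_percent(3200, 730)

print(new_size)  # (800, 600)
print(saving)    # 77.2
```

In other words, the compressed file is a little under a quarter of its original size.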

Let’s Recap

  • ChatGPT not only works on text and images, but it can also run programs with the “Code Interpreter” feature, accessible with a ChatGPT Plus subscription and GPT-4 activated.

  • Upload a .pdf file and ask ChatGPT to perform advanced analyses, such as summarizing the content, outlining the document’s structure, or extracting key figures and indicators.

  • For datasets in formats such as .csv, .xls, .json, etc., ChatGPT can perform basic analyses, provide more detailed analyses, and even generate graphs to visualize trends.

  • ChatGPT can also generate interactive maps and geospatial visualizations. 

  • ChatGPT can also compress images to optimize their file size without any significant loss of quality.

You’ve already seen how ChatGPT can generate text. In the next chapter, we’ll also see how it can generate images!
