Data

Introduction to Natural Language Processing

Natural Language Processing (NLP) allows us to classify, correct, predict, and even translate large quantities of text data. In this course, you will discover how to transform text into vectors for exploration and classification. We will explore bag-of-words, word embeddings, and sentiment analysis.
Hard
10 hours
Free-access course

Natural language processing, otherwise known as NLP, is the technology behind Siri, autocorrect, chatbots, and Google Translate. It’s what helps you translate text, filter spam, and detect fake news. In short, this technology allows a machine to understand and process human language. 

But how does it work under the hood? How can you use NLP to transform human language into something a computer can understand? Look no further; this course has the answer!

  • In Part 1 of this course, we will explore how to preprocess text data and prepare it for further processing by a computer.

  • In Part 2, we will explore a text vectorization technique called bag-of-words and solve text classification problems such as sentiment analysis. 

  • In Part 3, you will learn a more powerful vectorization technique called word embeddings and apply it to infer meaning from a text.
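As a rough illustration of the first two parts, the sketch below preprocesses two short sentences and turns them into bag-of-words count vectors using only the Python standard library. It is not the course's own code: the stop-word list, the function names preprocess and bag_of_words, and the sample sentences are all assumptions made for this example. In practice, libraries such as scikit-learn (for example, its CountVectorizer class) handle this step.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "was"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def bag_of_words(documents):
    """Return (vocabulary, vectors): one word-count vector per document."""
    tokenized = [preprocess(doc) for doc in documents]
    # The vocabulary is every distinct token, sorted so columns are stable.
    vocabulary = sorted({token for doc in tokenized for token in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical sample sentences.
docs = ["This movie was great, great fun", "The movie was boring"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)  # ['boring', 'fun', 'great', 'movie']
print(vectors)     # [[0, 1, 2, 1], [1, 0, 0, 1]]
```

Each document becomes a row of counts over a shared vocabulary, which is the kind of numeric input a classifier needs for a task such as sentiment analysis.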

Once you complete this course, you will have a basic understanding of how NLP models work and how to use them in machine-learning projects. We will also introduce you to the spaCy 3.4, scikit-learn 1.1, and NLTK 3.7 libraries in Python 3.10.

Ready to dive into one of the most innovative domains in artificial intelligence? Then let’s get started!

Learning outcomes

  • Preprocess Text Data
  • Vectorize Text for Classification Using Bag-of-Words
  • Vectorize Text for Exploration Using Word Embeddings
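To give a feel for the last outcome, here is a minimal sketch of how word embeddings can be explored: words are represented as vectors, and cosine similarity measures how close two words are. The three-dimensional vectors below are hand-written and purely hypothetical; real embeddings are learned from large text corpora and have many more dimensions. The names toy_embeddings, cosine_similarity, and most_similar are also assumptions for this example.

```python
import math

# Hypothetical, hand-written vectors for illustration only.
toy_embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to that of `word`."""
    candidates = (w for w in embeddings if w != word)
    return max(
        candidates,
        key=lambda w: cosine_similarity(embeddings[word], embeddings[w]),
    )


print(most_similar("king", toy_embeddings))  # queen
```

With these toy values, "king" sits much closer to "queen" than to "apple", which mirrors the idea that words used in similar contexts end up with similar vectors.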

Requirements

Prerequisites:

To get the most out of this course, you must be familiar with Python 3.10 and be able to use Python libraries to manipulate data. You must also be familiar with basic linear algebra and statistics and with the main concepts behind machine learning, including training and scoring models.

If you are unfamiliar with these concepts, take the following courses: 

Required tools:

  • Python 3.10, including:
    • spaCy 3.4 
    • NLTK 3.7 
    • scikit-learn 1.1 
    • pandas 1.5
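One possible way to check your environment against the list above is sketched here, using importlib.metadata from the standard library. The COURSE_VERSIONS dictionary and the installed_versions helper are names invented for this example; the package names are assumed to match their PyPI distribution names.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed by the course, keyed by assumed PyPI distribution name.
COURSE_VERSIONS = {
    "spacy": "3.4",
    "nltk": "3.7",
    "scikit-learn": "1.1",
    "pandas": "1.5",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(COURSE_VERSIONS).items():
    print(f"{name}: installed={installed}, course uses {COURSE_VERSIONS[name]}")
```

A value of None means the package is not installed in the current environment.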

Contributors

Instructor

Alexis Perrier

Author and instructor in data science, machine learning expert. Follow @alexip on Twitter.

Last updated: 1/23/2025