Lesson 17 of 20 · Patterns and Data

Natural language as data

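Computers work with numbers, so text needs special preprocessing before a model can use it: tokenization (splitting text into pieces called tokens), stop word removal (dropping very common words such as "the" and "is"), stemming (trimming words down to a root form, so "playing" becomes "play"), and encoding (giving each token a number). Modern NLP often uses subword tokenization such as byte-pair encoding (BPE), which splits rare words into smaller pieces that show up in many words.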
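The four steps can be sketched in a few lines of Python. This is a toy illustration, not how a real library does it: the stop word list is a tiny hand-picked set, the `stem` function only strips a few endings (real stemmers such as Porter's are more careful), and all function names here are made up for this example.

```python
import re

# Tiny hand-picked stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "on", "in", "of", "to"}

def tokenize(text):
    """Lowercase the text and split it into words made of letters."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(tokens):
    """Drop very common words that carry little meaning."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    """Very rough stemmer: strip one common ending if enough letters remain."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def encode(tokens):
    """Give each distinct token an integer ID, in order of first appearance."""
    vocab = {}
    ids = []
    for t in tokens:
        if t not in vocab:
            vocab[t] = len(vocab)
        ids.append(vocab[t])
    return ids, vocab

def preprocess(text):
    tokens = tokenize(text)
    tokens = remove_stop_words(tokens)
    tokens = [stem(t) for t in tokens]
    return encode(tokens)

ids, vocab = preprocess("The cats are playing and the cat played on the mat")
print(vocab)  # {'cat': 0, 'play': 1, 'mat': 2}
print(ids)    # [0, 1, 0, 1, 2]
```

Notice that "cats" and "cat" end up with the same ID, and so do "playing" and "played". That is the point of stemming: different forms of a word are treated as one item.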

  • Text preprocessing turns language into numbers a model can work with.
  • BPE tokenization breaks rare words into common subword pieces, so a small vocabulary can cover many words.
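To see how BPE builds subwords, here is a minimal sketch of its training loop: start with single letters, find the pair of neighbors that appears most often, glue that pair into one new symbol, and repeat. The word counts and the function names (`count_pairs`, `merge_pair`, `learn_bpe`) are invented for this example; real tokenizers add details such as end-of-word markers and byte-level handling.

```python
from collections import Counter

def count_pairs(words):
    """Count neighboring symbol pairs, weighted by how often each word appears."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged

def learn_bpe(word_counts, num_merges):
    """Run `num_merges` rounds of 'merge the most frequent pair'."""
    # Each word starts out split into single characters.
    words = {tuple(w): c for w, c in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(words)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # ties go to the pair seen first
        merges.append(best)
        words = merge_pair(words, best)
    return merges, words

# Made-up word counts for illustration.
merges, words = learn_bpe({"low": 5, "lower": 2, "lowest": 3}, num_merges=2)
for symbols, freq in words.items():
    print(symbols, freq)
```

After two merges, the letters l, o, w have been glued into the single piece "low", so "lower" is stored as "low" + "e" + "r". A word the tokenizer has never seen, like "lowly", could still be split into pieces it knows.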

Think about it

What is the ROC curve?

Natural language as data | Patterns and Data | 8th Grade AI for Kids | LittleActivity