PySpark Best Practices


Open Data Science via YouTube



Classroom Contents


  1. cloudera
  2. Spark Execution Model
  3. PySpark Driver Program
  4. How do we ship around Python functions?
  5. Pickle!
  6. DataFrame is just another word for...
  7. Use DataFrames
  8. REPLs and Notebooks
  9. Share your code
  10. Standard Python Project
  11. What is the shape of a PySpark job?
  12. PySpark Structure?
  13. Simple Main Method
  14. Write Testable Code
  15. Write Serializable Code
  16. Testing with SparkTestingBase
  17. Testing Suggestions
  18. Writing distributed code is the easy part...
  19. Get Serious About Logs
  20. Know your environment
  21. Complex Dependencies
  22. Many Python Environments
