Data Manipulation

jwoolbright23 edited this page Mar 19, 2024 · 1 revision

EDA with pandas

In the prep work for this class, the students learned:

  1. How to manipulate data with Python and pandas
  2. How to apply aggregate functions across multiple columns of data
  3. How to use the GroupBy method to group data by the values in one or more columns
  4. How to recode and map values within a column to new values
  5. The difference between wide and long format
    • How to use the .melt() function to change a DataFrame from wide to long format
  6. How to merge columns together
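
For a quick refresher before class, the prep topics above can be sketched in a few lines. This is a minimal illustration, not the course's own solution code: the `sales` and `regions` tables, their column names, and all values are invented for the example.

```python
import pandas as pd

# Hypothetical pumpkin sales data, invented for illustration only.
sales = pd.DataFrame({
    "city": ["Boston", "Boston", "Chicago", "Chicago"],
    "size": ["sml", "lge", "sml", "lge"],
    "low_price": [10.0, 20.0, 12.0, 22.0],
    "high_price": [14.0, 26.0, 16.0, 30.0],
})

# Apply several aggregate functions across multiple columns at once.
price_summary = sales[["low_price", "high_price"]].agg(["min", "max", "mean"])

# Group rows by city, then average each price column within each group.
city_means = sales.groupby("city")[["low_price", "high_price"]].mean()

# Recode abbreviations to readable labels with a mapping dictionary.
sales["size_label"] = sales["size"].map({"sml": "small", "lge": "large"})

# Merge in a second table on the shared "city" key column.
regions = pd.DataFrame({
    "city": ["Boston", "Chicago"],
    "region": ["Northeast", "Midwest"],
})
merged = sales.merge(regions, on="city", how="left")

print(price_summary)
print(city_means)
print(merged)
```

Values that are missing from the mapping dictionary become NaN after `.map()`, which is a common source of student confusion worth pointing out.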

Announcements

  1. Always check with your program manager to see if there are any upcoming deadlines you should highlight!

Large Group Time (Instructor)

Topics That Require Careful Attention

  1. GroupBy method and GroupBy objects: What actually happens when you group columns together?
  2. Best practices when creating new columns (what methods are best for different scenarios)
  3. When and when not to create functions and apply them to a DataFrame
  4. Wide vs Long format
  5. .melt() function
  6. pd.concat() function
  7. Pivot Tables
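
A small live demo can help anchor the topics listed above. The sketch below is one possible walkthrough under stated assumptions: the `wide` and `more` DataFrames and their column names are hypothetical, and the comparison of a vectorized expression with `apply` is only one way to frame the "when to use functions" discussion.

```python
import pandas as pd

# Hypothetical wide-format data: one row per city, one column per month.
wide = pd.DataFrame({
    "city": ["Boston", "Chicago"],
    "sep_price": [15.0, 18.0],
    "oct_price": [20.0, 24.0],
})

# groupby() alone returns a GroupBy object; nothing is aggregated
# until a method such as .mean() or .sum() is called on it.
grouped = wide.groupby("city")
grouped_type = type(grouped).__name__

# New column: a vectorized expression is usually the simplest choice.
wide["price_change"] = wide["oct_price"] - wide["sep_price"]

# Row-wise apply gives the same result here, but calls a Python
# function once per row, which is slower on large DataFrames.
change_via_apply = wide.apply(
    lambda row: row["oct_price"] - row["sep_price"], axis=1
)

# Wide -> long with melt(): one row per (city, month) pair.
long = wide.melt(
    id_vars="city",
    value_vars=["sep_price", "oct_price"],
    var_name="month",
    value_name="price",
)

# Long -> wide again with pivot_table().
pivoted = long.pivot_table(
    index="city", columns="month", values="price", aggfunc="mean"
)

# concat() is a top-level pandas function that stacks DataFrames.
more = pd.DataFrame(
    {"city": ["Denver"], "sep_price": [16.0], "oct_price": [21.0]}
)
stacked = pd.concat(
    [wide.drop(columns="price_change"), more], ignore_index=True
)

print(grouped_type)
print(long)
print(pivoted)
print(stacked)
```

Printing `grouped` itself shows only an object description rather than a table, which is a useful way to open the conversation about what grouping actually does.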

Studio (IA Notes)

  1. The studio continues working with the pumpkin data. It also references the San Francisco dataset used in the Cleaning Data with Pandas chapter.

Best Practices for ALL Studios

  1. Be prepared to clarify the studio instructions beyond just re-reading the words on the screen.
  2. Encourage students to work together and share ideas.
  3. Assist individuals as questions arise. Address frequent mistakes and/or questions to your whole group.
  4. Make a note of any issues that occur during the studio and provide that feedback to the instructor and LaunchCode team.

Studio (TF Notes)

  1. Students may have issues importing their dataset into Jupyter notebooks once it is downloaded.
    • They may be entering an incorrect file path or have the file stored in the wrong location. Double-check this with any learners having issues.
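
When troubleshooting a path problem, it can help to have the learner run a quick check in a notebook cell. The helper below is a sketch only: `load_csv` is a hypothetical name, not part of pandas or the studio starter code, and the file name in the usage comment is a placeholder.

```python
from pathlib import Path

import pandas as pd


def load_csv(path_str):
    """Read a CSV, raising a descriptive error if the path is wrong.

    Hypothetical helper for troubleshooting; not part of the studio code.
    """
    path = Path(path_str).expanduser()
    if not path.is_file():
        # Show both the path that was tried and the notebook's working
        # directory, since relative paths are resolved against the latter.
        raise FileNotFoundError(
            f"No file at {path.resolve()} "
            f"(current working directory: {Path.cwd()})"
        )
    return pd.read_csv(path)


# Example usage (placeholder file name):
# df = load_csv("pumpkins.csv")
```

Comparing the printed working directory with the location of the downloaded file usually shows whether the path or the file location is the problem.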