How do I encode categorical features using scikit-learn?

In order to include categorical features in your Machine Learning model, you have to encode them numerically using "dummy" or "one-hot" encoding. But how do you do this correctly using scikit-learn?

In this video, you'll learn how to use OneHotEncoder and ColumnTransformer to encode your categorical features and prepare your feature matrix in a single step. You'll also learn how to include this step within a Pipeline so that you can cross-validate your model and preprocessing steps simultaneously. Finally, you'll learn why you should use scikit-learn (rather than pandas) for preprocessing your dataset.
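The workflow described above can be sketched in a few lines. This is only a minimal illustration with invented toy data (the video uses a different dataset; see the linked notebook for the actual code):

```python
# Hedged sketch of the workflow described above, with invented toy data
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

X = pd.DataFrame({
    'Embarked': ['S', 'C', 'Q', 'S', 'C', 'Q', 'S', 'C'],
    'Sex': ['male', 'female', 'male', 'female', 'male', 'female', 'male', 'female'],
    'Fare': [7.25, 71.28, 8.05, 53.1, 8.46, 51.86, 21.07, 30.07],
})
y = pd.Series([0, 1, 1, 1, 0, 1, 0, 1])

# One-hot encode the categorical columns; pass the numeric column through.
# handle_unknown='ignore' avoids errors if a fold is missing a category.
column_transformer = make_column_transformer(
    (OneHotEncoder(handle_unknown='ignore'), ['Embarked', 'Sex']),
    remainder='passthrough')

# Chain preprocessing and model so both are cross-validated together
pipeline = make_pipeline(column_transformer, LogisticRegression())
scores = cross_val_score(pipeline, X, y, cv=2, scoring='accuracy')
print(scores.mean())
```

Because the encoder lives inside the Pipeline, it is re-fit on each training fold during cross-validation.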

AGENDA:
0:00 Introduction
0:22 Why should you use a Pipeline?
2:30 Preview of the lesson
3:35 Loading and preparing a dataset
6:11 Cross-validating a simple model
10:00 Encoding categorical features with OneHotEncoder
15:01 Selecting columns for preprocessing with ColumnTransformer
19:00 Creating a two-step Pipeline
19:54 Cross-validating a Pipeline
21:44 Making predictions on new data
23:43 Recap of the lesson
24:50 Why should you use scikit-learn (rather than pandas) for preprocessing?

CODE FROM THIS VIDEO: https://github.com/justmarkham/scikit-learn-videos/blob/master/10_categorical_features.ipynb

WANT TO JOIN MY NEXT LIVE WEBCAST? Become a member ($5/month):
https://www.patreon.com/dataschool


=== RELATED RESOURCES ===

OneHotEncoder documentation: https://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-categorical-features
ColumnTransformer documentation: https://scikit-learn.org/stable/modules/compose.html#columntransformer-for-heterogeneous-data
Pipeline documentation: https://scikit-learn.org/stable/modules/compose.html#pipeline

My video on cross-validation: https://www.youtube.com/watch?v=6dbrR-WymjI&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=7
My video on grid search: https://www.youtube.com/watch?v=Gol_qOgRqfA&list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A&index=8
My lesson notebook on StandardScaler: https://nbviewer.jupyter.org/github/justmarkham/DAT8/blob/master/notebooks/19_advanced_sklearn.ipynb


=== WANT TO GET BETTER AT MACHINE LEARNING? ===

1) WATCH my scikit-learn video series: https://www.youtube.com/playlist?list=PL5-da3qGB5ICeMbQuqbbCOQWcS6OYBr5A

2) SUBSCRIBE for more videos: https://www.youtube.com/dataschool?sub_confirmation=1

3) ENROLL in my Machine Learning course: https://www.dataschool.io/learn/

4) LET'S CONNECT!
- Newsletter: https://www.dataschool.io/subscribe/
- Twitter:
- Facebook: https://www.facebook.com/DataScienceSchool/
- LinkedIn: https://www.linkedin.com/in/justmarkham/

#python #datascience #machinelearning #scikitlearn
Video published: November 13, 2019
" What is the point of the pipeline? The point of the pipeline is to chain steps together sequentially. Normally, you put preprocessing steps and model building steps in a pipeline. Now, why should you build a pipeline? There are two main reasons." - How do I encode categorical features using scikit-learn?

" What is the point of the pipeline? The point of the pipeline is to chain steps together sequentially. Normally, you put preprocessing steps and model building steps in a pipeline. Now, why should you build a pipeline? There are two main reasons."

How do I encode categorical features using scikit-learn?
2019年11月13日 
00:00:26 - 00:27:59
1) It allows you to properly cross-validate a process rather than just a model. In other words, when you are doing cross-validation like cross_val_score, normally you just pass a model to it. Well, there are cases when that is not going to give you accurate results because you're doing the preprocessing outside of the cross-validation.So a pipeline, generally speaking, is useful because you can cross-validate a process that includes(a) *preprocessing* as well as(b) *model building*. - How do I encode categorical features using scikit-learn?

1) It allows you to properly cross-validate a process rather than just a model. In other words, when you are doing cross-validation like cross_val_score, normally you just pass a model to it. Well, there are cases when that is not going to give you accurate results because you're doing the preprocessing outside of the cross-validation.So a pipeline, generally speaking, is useful because you can cross-validate a process that includes(a) *preprocessing* as well as(b) *model building*.

How do I encode categorical features using scikit-learn?
2019年11月13日 
00:00:58 - 00:27:59
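That first reason can be sketched with invented data: pass the whole Pipeline to cross_val_score, and the preprocessing step (here a StandardScaler as a stand-in) is re-fit on each training fold, so the held-out fold never leaks into it.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] > 0).astype(int)  # toy target that depends on one feature

# Cross-validating the whole process: the scaler is fit only on each
# training fold, never on the corresponding validation fold
pipe = make_pipeline(StandardScaler(), LogisticRegression())
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```

Fitting the scaler on the full dataset before cross-validating would instead let information from the validation folds influence the preprocessing.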
00:05:20 "Kevin, it's am Winston-Salem time and I am digging this. I was very confused. Thank you so much."
00:09:15 "scoring = accuracy"
00:12:00 "I was taught that we use .transform only for the test data but not for the training data. In , you use it for the test+training data. Could you explain why you do this? Thank you! Also, your videos are great :)"
00:17:45 "When applying make_column_transformer() at , it returns the results (e.g., columns) in a different order than the input data."
00:22:00 "OK, you've said that we are cross-validating not the model but the pipeline. This might be useful in some other case, but what's the point of splitting first and then applying one-hot encoding? The result should be the same if you do the one-hot encoding first, right? Am I missing something?"
00:26:19 "Thank you so much for this video. It really helped me a lot. I do have a question about this process. In my case, one of the columns of my out-of-sample data has more categories than my in-sample data (basically, I have the opposite scenario as the one you mentioned at ). Would this process work in my case?"

Data School

=== VIDEO TIMETABLES (141 videos on channel) ===

My top 50 scikit-learn tips (April 20, 2023)

00:00:00 Introduction
00:01:03 1. Transform data with ColumnTransformer
00:04:19 2. Seven ways to select columns
00:08:18 3. "fit" vs "transform"
00:10:53 4. Don't use "fit" on new data!
00:15:05 5. Don't use pandas for preprocessing!
00:19:00 6. Encode categorical features
00:24:07 7. Handle new categories in testing data
00:24:08 (Phil Webb): "handle_unknown='ignore'. A most useful tip! If only I'd read the docs. But I don't understand when you say to go back and include the previously unknown categories. How can you train on unknown data? Even if you include the unknown "labels" in your encoder, they will all be zero during training because, obviously, they weren't in your training data. I think it's best to just leave it alone: if it wasn't in your training data, it's probably a rare occurrence and you can ignore it. Zeros in all known categories simplifies what happens downstream. If you want to train on unknown data, you would need to use "dummy data" and set min_frequency or max_categories, then handle_unknown='infrequent_if_exist' to give downstream modules something to work with."
00:27:16 8. Chain steps with Pipeline
00:30:19 9. Encode "missingness" as a feature
00:30:20 (Phil Webb): "Missingness. So, what happens when a feature is fully populated in your training data but has missing values in your validation data? Just bringing that up in case you don't get to it."
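On the question above, a hedged sketch with invented data: SimpleImputer learns its fill statistic at fit time for every column, so a value that only goes missing in validation data is still filled. (With add_indicator=True, however, the indicator columns are fixed at fit time, so such a feature would likely not get a missingness flag.)

```python
import numpy as np
from sklearn.impute import SimpleImputer

X_train = np.array([[1.0], [2.0], [3.0]])  # fully populated at fit time
X_valid = np.array([[np.nan], [4.0]])      # missing appears only later

# The imputer learns its fill value (here the mean, 2.0) at fit time,
# so it can still fill values that only go missing in new data
imputer = SimpleImputer(strategy='mean')
imputer.fit(X_train)
print(imputer.transform(X_valid))  # [[2.], [4.]]
```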
00:33:12 10. Why set a random state?
00:35:40 11. Better ways to impute missing values
00:41:22 12. Pipeline vs make_pipeline
00:44:08 13. Inspect a Pipeline
00:47:03 14. Handle missing values automatically
00:49:47 15. Don't drop the first categorical level
00:54:15 16. Tune a Pipeline
01:01:09 17. Randomized search vs grid search
01:05:42 18. Examine grid search results
01:08:10 19. Logistic regression tuning parameters
01:12:41 20. Plot a confusion matrix
01:15:37 21. Plot multiple ROC curves
01:17:21 22. Use the correct Pipeline methods
01:18:59 23. Access model coefficients
01:20:11 24. Visualize a decision tree
01:23:57 25. Improve a decision tree by pruning it
01:25:23 26. Use stratified sampling when splitting data
01:29:40 27. Impute missing values for categoricals
01:32:10 28. Save a model or Pipeline
01:33:47 29. Add multiple text columns to a model
01:35:35 30. More ways to inspect a Pipeline
01:37:28 31. Know when shuffling is required
01:42:32 32. Use AUC with multiclass problems
01:46:04 33. Create custom features with scikit-learn
01:50:03 34. Automate feature selection
01:52:24 35. Use pandas objects with scikit-learn
01:53:37 36. Pass parameters as keyword arguments
01:55:23 37. Create an interactive Pipeline diagram
01:57:22 38. Get the names of transformed features
01:59:32 39. Load a toy dataset into pandas
02:01:33 40. View all model parameters
02:03:00 41. Encode binary features
02:03:00 (Phil Webb): "drop='if_binary' makes sense; otherwise you have two columns which are perfectly redundant, not just implied. At least, it's a happy compromise. My only hesitation, without playing with it, is that the order is probably alphabetic. If it assigned 0 to the most frequent category, then handle_unknown='ignore' would make sense. Otherwise, you're lumping unknowns in with the "least" alphabetic category. That's kinda silly."
02:06:59 42. Column selection tricks
02:09:40 (Phil Webb): "Hopefully you'll never have 200 columns to pass through, but I think specifying which columns to pass through makes your intent clearer. The default is remainder='drop', so the author thought that as well."

02:10:00 (Phil Webb): "Yeah, if you have the time and the determination, you could run DecisionTreeClassifier, then plot_tree, and look through it for conditions like name != value. Then you could use the order in which the decision tree "discovers" categories as the ordinal value for that feature, 0 being first. You just need to write a custom transformer to preprocess your validation data and assign -1 to all unknowns. Another trick I've had success with is ordering by frequency, with 0 being the most frequent. In that case, your custom transformer should assign 0 to all unknowns. Easy-peasy."
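As an alternative to the custom transformer described above, scikit-learn 0.24+ lets OrdinalEncoder assign a sentinel value to unknowns directly; a small sketch with invented data:

```python
from sklearn.preprocessing import OrdinalEncoder

X_train = [['S'], ['C'], ['Q']]

# OrdinalEncoder can assign a sentinel to unseen categories directly,
# without a custom transformer (handle_unknown='use_encoded_value')
encoder = OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1)
encoder.fit(X_train)
print(encoder.transform([['Q'], ['Z']]))  # [[1.], [-1.]]
```

The categories parameter of OrdinalEncoder can also impose a custom ordering, such as the frequency ordering the comment suggests.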
02:10:02 43. Save time when encoding categoricals
02:16:53 44. Speed up a grid search
02:19:01 45. Create feature interactions
02:23:00 46. Ensemble multiple models
02:27:23 47. Tune an ensemble
02:31:22 48. Run part of a Pipeline
02:34:52 49. Tune multiple models at once
02:39:50 50. Solve many ML problems with one solution
21 more pandas tricks (May 13, 2022)

00:00:00 Introduction
00:00:36 1. Check for equality
00:01:27 2. Check for equality (alternative)
00:02:38 3. Use NumPy without importing NumPy
00:03:42 4. Calculate memory usage
00:04:10 5. Count the number of words in a column
00:04:45 6. Convert one set of values to another
00:06:59 7. Convert continuous data into categorical data (alternative)
00:08:05 8. Create a cross-tabulation
00:08:55 9. Create a datetime column from multiple columns
00:09:34 10. Resample a datetime column
00:11:07 11. Read and write from compressed files
00:12:10 12. Fill missing values using interpolation
00:12:45 13. Check for duplicate merge keys
00:13:50 14. Transpose a wide DataFrame
00:14:47 15. Create an example DataFrame (alternative)
00:16:06 16. Identify rows that are missing from a DataFrame
00:17:09 17. Use query to avoid intermediate variables
00:19:06 18. Reshape a DataFrame from wide format to long format
00:21:19 19. Reverse row order (alternative)
00:22:25 20. Reverse column order (alternative)
00:23:21 21. Split a string into multiple columns (alternative)
Tune multiple models simultaneously with GridSearchCV (October 26, 2021)

00:02:15 (gaurav malik): "Had one doubt: didn't understand the placeholder part at ."