R Tutorial : Categorical Response Variables in R

Want to learn more? Take the full course at https://learn.datacamp.com/courses/statistical-modeling-in-r-part-2 at your own pace. More than a video, you'll learn hands-on coding & quickly apply skills to your daily work.

---

In the previous segment, we talked about effect sizes. An effect size is a number that summarizes how the output of a model changes when we change the input.

When we are looking at the effect of a quantitative input X on the output Y, the effect size is a rate and has units of Y divided by X.

But for the effect of a categorical input on an output Y, the effect size is a difference and has the same units as Y.
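
These two cases can be written compactly. The notation is an assumption for illustration and does not appear in the video: f is the model function, x is a value of the quantitative input, Δx is a change in that input, and A and B are two levels of the categorical input, with all other inputs held fixed.

$$
\text{quantitative input:}\quad \text{effect size} = \frac{f(x + \Delta x) - f(x)}{\Delta x} \qquad [\text{units of } Y \text{ per unit of } X]
$$

$$
\text{categorical input:}\quad \text{effect size} = f(B) - f(A) \qquad [\text{units of } Y]
$$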

What happens when the response variable is categorical, that is, when the output is one of a set of named levels instead of a number? This is more than a technical question. It goes to the heart of what the output of the model function should be for a categorical response variable. It turns out that providing a category as output, while natural, is very limiting. It is better to give a number or a set of numbers: the probability, according to the model, of the class of interest, or the probabilities of all the classes.

As an example, consider a model of the categorical variable married as a function of explanatory variables like age, education, and sex.

As always, we need to have a model from which to calculate the effect size. We'll compare the model output for two different ages.

As you can see, the model's output is the same category for both ages. Does this mean that the effect size of age on married is zero, that is, no effect of age? Not really.

Changes in categorical outputs are all or nothing: either a change or no change at all. It's as if we were tracking one individual over the years: "no change this year", "no change the next year", "still no change", "finally, a change". But our models are really about groups. For any individual, marriage is all or nothing, but for groups, we can talk about the probability of an individual being married.

Many model architectures for categorical outputs do calculate the probability of each possible level of the output.

The model indicates that an extra year of age is associated with a 16 percentage point increase in the probability of being married.
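
The contrast between a category output and a probability output can be sketched in base R. This is only an illustration under stated assumptions, not the model or data from the video: the data frame `people` is simulated, the coefficients used to simulate it are made up, the model uses age alone (no education or sex), and logistic regression via `glm()` stands in as one architecture that produces probabilities. The numbers it yields will therefore not match the 16 percentage point figure above.

```r
# Hypothetical, simulated data: NOT the data set used in the video.
set.seed(1)
n <- 2000
age <- runif(n, 18, 40)

# Assumed (made-up) relationship: probability of being married rises with age.
p_true <- plogis(-6 + 0.2 * age)
married <- factor(ifelse(runif(n) < p_true, "Married", "Single"),
                  levels = c("Single", "Married"))
people <- data.frame(age = age, married = married)

# Logistic regression: with a factor response, glm() models the probability
# of any level other than the first, so here it models P(married == "Married").
mod <- glm(married ~ age, data = people, family = binomial)

# Two inputs that differ by one year of age (illustrative values).
inputs <- data.frame(age = c(21, 22))

# Probability output: P(Married) at each age.
prob <- unname(predict(mod, newdata = inputs, type = "response"))

# Category output: the more probable level at each age.
class_out <- ifelse(prob > 0.5, "Married", "Single")

# Effect size of age on the probability of being married, per year of age.
effect <- diff(prob) / diff(inputs$age)

print(class_out)  # category at each age
print(prob)       # probability at each age
print(effect)     # change in probability per year
```

Comparing `class_out` with `prob` shows the point of the passage: the category can be identical at both ages even though the probability behind it has moved, and it is the change in probability that gives a usable effect size.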
