Throughout this course, you have worked through learnr tutorials to check your grasp of various concepts in biological data science. Your exam uses the same format.
This questionnaire contains 21 questions, 20 of which are multiple-choice.
Make sure your email address and GitHub username are correct.
Do not forget to submit your answer after each exercise.
In accordance with the GDPR (General Data Protection Regulation), we are required to inform you that your results will be collected in order to track your progress. The data will be recorded under the user name shown at the top of this page. Correct it if necessary! By using this tutorial, you expressly agree that these data may be collected by your teachers and used to help you and to assess you. Once anonymised, these data may also be used for global studies, in a scientific and/or educational context only.
The growth dataset is provided; it contains the variables weight and height.
Create a scatterplot with weight on the x-axis and height on the y-axis.
Snippets are provided at the end of the question.
set.seed(1000)
growth <- tibble::tibble(
weight = (1:100 + rnorm(n = 100, mean = 0,sd = 0.5) + 10),
height = (0.3*weight) + rnorm(n = 100, mean = 0,sd = 3) + 30)
#chart(growth, height~ weight) +
# geom_point()
Answer the question below based on the chart above:
Snippets
## Charts ###############################################################################################
# ...charts
#..c
## Charts: Add ##########################################################################################
#..charts: add layers or annotations
#.ca
#.caplotly: convert last ggplot2 into interactive chart
plotly::ggplotly()
#.caylab: add or change Y label
${1:CHART} +
ylab("${2:YOUR Y LABEL HERE}")
#.caxlab: add or change X label
${1:CHART} +
xlab("${2:YOUR X LABEL HERE}")
#.catitle: add a plot title
${1:CHART} +
ggtitle("${2:YOUR TITLE HERE}")
## Charts: Multivariate #################################################################################
# ..charts: multivariate
# .cm
#.cmcorr: correlation chart
corrplot::corrplot(cor(${1:DF}[, ${2:1:3}],
use = "pairwise.complete.obs"), method = "ellipse")
#.cmxy: multivariate X-Y scatterplot
GGally::ggscatmat(as.data.frame(${1:DF}), ${2:1:3})
## Charts: Bivariate ####################################################################################
# ..charts: bivariate
# .cb
#.cbhistfact: histogram by factor (facets)
chart(data = ${1:DF}, ~${2:XNUM} %fill=% ${3:XFACTOR} | ${3:XFACTOR}) +
geom_histogram(data = select(${1:DF}, -${3:XFACTOR}), fill = "grey", bins = ${4:30}) +
geom_histogram(bins = ${4:30}, show.legend = FALSE)
#.cberrbar2: error bars by two factors
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR1} %col=% ${4:XFACTOR2}) +
geom_jitter(alpha = 0.4, position = position_dodge(0.4)) +
stat_summary(geom = "point", fun.y = "mean", position = position_dodge(0.4)) +
stat_summary(geom = "errorbar", width = 0.1, position = position_dodge(0.4),
fun.data = "mean_cl_normal", fun.args = list(conf.int = 0.95))
#.cberrbar: error bars by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_jitter(alpha = 0.4, width = 0.2) +
stat_summary(geom = "point", fun.y = "mean") +
stat_summary(geom = "errorbar", width = 0.1,
fun.data = "mean_cl_normal", fun.args = list(conf.int = 0.95))
#.cbviolin: violinplot by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_violin()
#.cbbox: boxplot by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_boxplot()
#.cbxy: X-Y scatterplot
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XNUM}) +
geom_point()
## Charts: Univariate ###################################################################################
# ..charts: univariate
#.cu
#.cuqqchisq: QQ plot - chi-square
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "chisq", df = ${3:DEGREES_OF_FREEDOM},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqf: QQ plot - F
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "f", df1 = ${3:NUMERATOR_DF}, df2 = ${4:DENOMINATOR_DF},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqt: QQ plot - Student t
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "t", df = ${3:DEGREES_OF_FREEDOM},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqnorm: QQ plot - normal
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "norm",
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuhbar: horizontal bars
chart(data = ${1:DF}, ~factor(${2:VAR})) +
geom_bar() + coord_flip()
#.cuvbar: vertical bars
chart(data = ${1:DF}, ~factor(${2:VAR})) +
geom_bar()
#.cuhist: histogram
chart(data = ${1:DF}, ~${2:VARNUM}) +
geom_histogram(binwidth = ${3:30})
The fisheries department of the Public Service of Wallonia ran a fishing campaign targeting the grayling (ombre commun) across its territory, in order to study the size of the specimens found in its rivers.
The thymallus dataset is provided; it contains the variable size. Create a histogram to determine the mode and the symmetry of the distribution.
Snippets are provided at the end of the question.
set.seed(1000)
thymallus <- tibble::tibble(size = rnorm(5000, 30, 8)) %>.%
filter(., size > 0)
#chart(thymallus, ~ size) +
# geom_histogram() +
# labs(x = "Unimodal & symmetric", y = "Counts")
Answer the question below based on the chart above:
Snippets
## Charts ###############################################################################################
# ...charts
#..c
## Charts: Add ##########################################################################################
#..charts: add layers or annotations
#.ca
#.caplotly: convert last ggplot2 into interactive chart
plotly::ggplotly()
#.caylab: add or change Y label
${1:CHART} +
ylab("${2:YOUR Y LABEL HERE}")
#.caxlab: add or change X label
${1:CHART} +
xlab("${2:YOUR X LABEL HERE}")
#.catitle: add a plot title
${1:CHART} +
ggtitle("${2:YOUR TITLE HERE}")
## Charts: Multivariate #################################################################################
# ..charts: multivariate
# .cm
#.cmcorr: correlation chart
corrplot::corrplot(cor(${1:DF}[, ${2:1:3}],
use = "pairwise.complete.obs"), method = "ellipse")
#.cmxy: multivariate X-Y scatterplot
GGally::ggscatmat(as.data.frame(${1:DF}), ${2:1:3})
## Charts: Bivariate ####################################################################################
# ..charts: bivariate
# .cb
#.cbhistfact: histogram by factor (facets)
chart(data = ${1:DF}, ~${2:XNUM} %fill=% ${3:XFACTOR} | ${3:XFACTOR}) +
geom_histogram(data = select(${1:DF}, -${3:XFACTOR}), fill = "grey", bins = ${4:30}) +
geom_histogram(bins = ${4:30}, show.legend = FALSE)
#.cberrbar2: error bars by two factors
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR1} %col=% ${4:XFACTOR2}) +
geom_jitter(alpha = 0.4, position = position_dodge(0.4)) +
stat_summary(geom = "point", fun.y = "mean", position = position_dodge(0.4)) +
stat_summary(geom = "errorbar", width = 0.1, position = position_dodge(0.4),
fun.data = "mean_cl_normal", fun.args = list(conf.int = 0.95))
#.cberrbar: error bars by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_jitter(alpha = 0.4, width = 0.2) +
stat_summary(geom = "point", fun.y = "mean") +
stat_summary(geom = "errorbar", width = 0.1,
fun.data = "mean_cl_normal", fun.args = list(conf.int = 0.95))
#.cbviolin: violinplot by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_violin()
#.cbbox: boxplot by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_boxplot()
#.cbxy: X-Y scatterplot
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XNUM}) +
geom_point()
## Charts: Univariate ###################################################################################
# ..charts: univariate
#.cu
#.cuqqchisq: QQ plot - chi-square
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "chisq", df = ${3:DEGREES_OF_FREEDOM},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqf: QQ plot - F
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "f", df1 = ${3:NUMERATOR_DF}, df2 = ${4:DENOMINATOR_DF},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqt: QQ plot - Student t
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "t", df = ${3:DEGREES_OF_FREEDOM},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqnorm: QQ plot - normal
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "norm",
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuhbar: horizontal bars
chart(data = ${1:DF}, ~factor(${2:VAR})) +
geom_bar() + coord_flip()
#.cuvbar: vertical bars
chart(data = ${1:DF}, ~factor(${2:VAR})) +
geom_bar()
#.cuhist: histogram
chart(data = ${1:DF}, ~${2:VARNUM}) +
geom_histogram(binwidth = ${3:30})
# dataset name: df
names(df)
[1] "y" "group"
summary(df)
y group
Min. :10.15 A:40
1st Qu.:18.12 B:40
Median :21.59 C:40
Mean :21.37
3rd Qu.:24.69
Max. :35.08
anova(anova. <- lm(data = df, y ~ group))
Following the analysis of the results above, perform a Tukey multiple comparison test.
Snippets are provided at the end of the question.
set.seed(43)
y <- c(rnorm(n = 40, mean = 20, sd = 5),
rnorm(n = 40, mean = 21, sd = 5),
rnorm(n = 40, mean = 22, sd = 5))
group <- rep(c("A", "B", "C"), each = 40)
df <- tibble::tibble(y = y, group = as.factor(group))
anova(anova. <- lm(data = df, y ~ group))
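As a hedged illustration of the Tukey step, the sketch below rebuilds the same simulated data and runs the comparison with base R's TukeyHSD() on an aov() fit. This is a swapped-in technique, not the multcomp snippet provided for the question, and the use of data.frame() instead of tibble::tibble() is an assumption made only so the example needs no extra package. The object names fit and tukey are illustrative.

```r
# Rebuild the simulated data from the setup chunk (same seed, same draws)
set.seed(43)
y <- c(rnorm(n = 40, mean = 20, sd = 5),
  rnorm(n = 40, mean = 21, sd = 5),
  rnorm(n = 40, mean = 22, sd = 5))
group <- rep(c("A", "B", "C"), each = 40)
df <- data.frame(y = y, group = as.factor(group))

# TukeyHSD() requires an aov object, so the one-way model is fitted with aov()
fit <- aov(y ~ group, data = df)

# All pairwise differences between group means with 95% family-wise intervals
tukey <- TukeyHSD(fit, which = "group", conf.level = 0.95)
print(tukey)
```

Each row of the printed table is one pairwise comparison (B-A, C-A, C-B); a pair is declared different at the 5% level when its adjusted p-value is below 0.05, or equivalently when its interval excludes zero.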
# ..hypothesis tests: means
#.hm
# .hmanovamult: anova - multiple comparisons [multcomp]
summary(anovaComp. <- confint(multcomp::glht(anova.,
linfct = multcomp::mcp(${1:XFACTOR} = "Tukey")))) # Add a second factor if you want
.oma <- par(oma = c(0, 5.1, 0, 0)); plot(anovaComp.); par(.oma); rm(.oma)
# .hmanovaresid: anova - residuals
residuals(anova.)
# .hmanovaqqplot: anova - residuals QQ-plot
# plot(anova., which = 2)
anova. %>.%
chart(broom::augment(.), aes(sample = .std.resid)) +
geom_qq() +
#geom_qq_line(colour = "darkgray") +
labs(x = "Theoretical quantiles", y = "Standardized residuals") +
ggtitle("Normal Q-Q")
# .hmanova2nested: two-way ANOVA (nested model)
anova(anova. <- lm(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR} + ${4:BLOCK} %in% ${3:XFACTOR}))
# .hmanova2noint: two-way ANOVA (without interactions)
anova(anova. <- lm(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR1} + ${4:XFACTOR2}))
# .hmanova2: two-way ANOVA (complete model)
anova(anova. <- lm(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR1} * ${4:XFACTOR2}))
# .hmanova2desc: two-way ANOVA (description)
${1:DF} %>.%
group_by(., ${2:XFACTOR1}, ${3:XFACTOR2}) %>.%
summarise(., mean = mean(${4:YNUM}), sd = sd(${4:YNUM}), count = sum(!is.na(${4:YNUM})))
# .hmanova1: one-way ANOVA
anova(anova. <- lm(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}))
# .hmanova1desc: one-way ANOVA (description)
${1:DF} %>.%
group_by(., ${2:XFACTOR}) %>.%
summarise(., mean = mean(${3:YNUM}), sd = sd(${3:YNUM}), count = sum(!is.na(${3:YNUM})))
# .hmttestindep: independent Student's t-test
t.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR},
alternative = "two.sided", conf.level = 0.95, var.equal = TRUE)
# .hmttestpaired: paired Student's t-test
t.test(${1:DF}\$XNUM, ${1:DF}\$YNUM,
alternative = "two.sided", conf.level = 0.95, paired = TRUE)
# .hmttestuni: univariate Student's t-test
t.test(${1:DF}\$${2:XNUM},
alternative = "two.sided", mu = 0, conf.level = 0.95)
Based on the correlation matrix below, computed from data collected on 200 crabs:
Variable | FL | RW | CL | CW | BD |
---|---|---|---|---|---|
FL | 1 | 0.9051 | 0.9785 | 0.9654 | 0.9866 |
RW | 0.9051 | 1 | 0.8916 | 0.8992 | 0.8895 |
CL | 0.9785 | 0.8916 | 1 | 0.9951 | 0.9823 |
CW | 0.9654 | 0.8992 | 0.9951 | 1 | 0.9675 |
BD | 0.9866 | 0.8895 | 0.9823 | 0.9675 | 1 |
Answer the questions below based on the table above:
Reproduce the following chart (figure not shown) using the dataset df:
# dataset name: df
# variable names
names(df)
[1] "x" "y" "zone" "area"
summary(df)
x y zone area
Min. : 4.127 Min. : 13.68 A:150 1:100
1st Qu.: 85.470 1st Qu.: 94.31 B:150 2:100
Median :160.200 Median :171.84 3:100
Mean :160.770 Mean :170.66
3rd Qu.:234.662 3rd Qu.:244.18
Max. :315.348 Max. :331.31
Snippets are provided at the end of the question.
set.seed(43)
df <- tibble::tibble(x = 1:300 + rnorm(n = 300, mean = 10, sd = 5),
y = x + rnorm(n = 300, mean = 10, sd = 5),
zone = as.factor(rep(c("A", "B"), times = 150)),
area = as.factor(rep(1:3, each = 100))
)
#TODO
The following snippets are at your disposal:
## Charts ###############################################################################################
# ...charts
# ..c
## Charts: Add ##########################################################################################
# ..charts: add layers or annotations
# .ca
# .caplotly: convert last ggplot2 into interactive chart
plotly::ggplotly()
# .caylab: add or change Y label
${1:CHART} +
ylab("${2:YOUR Y LABEL HERE}")
# .caxlab: add or change X label
${1:CHART} +
xlab("${2:YOUR X LABEL HERE}")
# .catitle: add a plot title
${1:CHART} +
ggtitle("${2:YOUR TITLE HERE}")
## Charts: Multivariate #################################################################################
# ..charts: multivariate
# .cm
# .cmcorr: correlation chart
corrplot::corrplot(cor(${1:DF}[, ${2:1:3}],
use = "pairwise.complete.obs"), method = "ellipse")
# .cmxy: multivariate X-Y scatterplot
GGally::ggscatmat(as.data.frame(${1:DF}), ${2:1:3})
## Charts: Bivariate ####################################################################################
# ..charts: bivariate
# .cb
#.cbhistfact: histogram by factor (facets)
chart(data = ${1:DF}, ~${2:XNUM} %fill=% ${3:XFACTOR} | ${3:XFACTOR}) +
geom_histogram(data = select(${1:DF}, -${3:XFACTOR}), fill = "grey", bins = ${4:30}) +
geom_histogram(bins = ${4:30}, show.legend = FALSE)
# .cberrbar2: error bars by two factors
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR1} %col=% ${4:XFACTOR2}) +
geom_jitter(alpha = 0.4, position = position_dodge(0.4)) +
stat_summary(geom = "point", fun.y = "mean", position = position_dodge(0.4)) +
stat_summary(geom = "errorbar", width = 0.1, position = position_dodge(0.4),
fun.data = "mean_cl_normal", fun.args = list(conf.int = 0.95))
#.cberrbar: error bars by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_jitter(alpha = 0.4, width = 0.2) +
stat_summary(geom = "point", fun.y = "mean") +
stat_summary(geom = "errorbar", width = 0.1,
fun.data = "mean_cl_normal", fun.args = list(conf.int = 0.95))
# .cbviolin: violinplot by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_violin()
# .cbbox: boxplot by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_boxplot()
# .cbxy: X-Y scatterplot
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XNUM}) +
geom_point()
# dataset name: df
names(df)
[1] "x"
summary(df)
x
Min. :-94.368
1st Qu.:-24.333
Median : 7.475
Mean : 9.854
3rd Qu.: 36.684
Max. :113.215
Based on the dataset above, answer the question below.
Snippets are provided at the end of the question.
set.seed(43)
df <- tibble::tibble(x = rnorm(n = 60, mean = 10, sd = 50))
## Charts ###############################################################################################
# ...charts
# ..c
## Charts: Add ##########################################################################################
# ..charts: add layers or annotations
# .ca
# .caplotly: convert last ggplot2 into interactive chart
plotly::ggplotly()
# .caylab: add or change Y label
${1:CHART} +
ylab("${2:YOUR Y LABEL HERE}")
# .caxlab: add or change X label
${1:CHART} +
xlab("${2:YOUR X LABEL HERE}")
# .catitle: add a plot title
${1:CHART} +
ggtitle("${2:YOUR TITLE HERE}")
## Charts: Multivariate #################################################################################
# ..charts: multivariate
# .cm
# .cmcorr: correlation chart
corrplot::corrplot(cor(${1:DF}[, ${2:1:3}],
use = "pairwise.complete.obs"), method = "ellipse")
# .cmxy: multivariate X-Y scatterplot
GGally::ggscatmat(as.data.frame(${1:DF}), ${2:1:3})
## Charts: Bivariate ####################################################################################
# ..charts: bivariate
# .cb
#.cbhistfact: histogram by factor (facets)
chart(data = ${1:DF}, ~${2:XNUM} %fill=% ${3:XFACTOR} | ${3:XFACTOR}) +
geom_histogram(data = select(${1:DF}, -${3:XFACTOR}), fill = "grey", bins = ${4:30}) +
geom_histogram(bins = ${4:30}, show.legend = FALSE)
# .cberrbar2: error bars by two factors
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR1} %col=% ${4:XFACTOR2}) +
geom_jitter(alpha = 0.4, position = position_dodge(0.4)) +
stat_summary(geom = "point", fun.y = "mean", position = position_dodge(0.4)) +
stat_summary(geom = "errorbar", width = 0.1, position = position_dodge(0.4),
fun.data = "mean_cl_normal", fun.args = list(conf.int = 0.95))
#.cberrbar: error bars by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_jitter(alpha = 0.4, width = 0.2) +
stat_summary(geom = "point", fun.y = "mean") +
stat_summary(geom = "errorbar", width = 0.1,
fun.data = "mean_cl_normal", fun.args = list(conf.int = 0.95))
# .cbviolin: violinplot by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_violin()
# .cbbox: boxplot by factor
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}) +
geom_boxplot()
# .cbxy: X-Y scatterplot
chart(data = ${1:DF}, ${2:YNUM} ~ ${3:XNUM}) +
geom_point()
## Charts: Univariate ###################################################################################
# ..charts: univariate
#.cu
#.cuqqchisq: QQ plot - chi-square
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "chisq", df = ${3:DEGREES_OF_FREEDOM},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqf: QQ plot - F
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "f", df1 = ${3:NUMERATOR_DF}, df2 = ${4:DENOMINATOR_DF},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqt: QQ plot - Student t
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "t", df = ${3:DEGREES_OF_FREEDOM},
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuqqnorm: QQ plot - normal
car::qqPlot(${1:DF}[["${2:XNUM}"]], distribution = "norm",
envelope = 0.95, col = "Black", ylab = "${2:XNUM}")
#.cuhbar: horizontal bars
chart(data = ${1:DF}, ~factor(${2:VAR})) +
geom_bar() + coord_flip()
#.cuvbar: vertical bars
chart(data = ${1:DF}, ~factor(${2:VAR})) +
geom_bar()
#.cuhist: histogram
chart(data = ${1:DF}, ~${2:VARNUM}) +
geom_histogram(binwidth = ${3:30})
Scientists carry out fishing surveys at 100 stations of interest to study fish diversity in the rivers of Wallonia. They are particularly interested in the species Barbus barbus L. 1758. Following their surveys, they want to know the number of stations in zones A, C and D where the relative density of barbel exceeds 12.5% of all fish caught.
They provide a dataset named density with two variables: the zone (area) and the relative density (densi) of common barbel with respect to all fish caught.
# dataset name: density
names(density)
[1] "area" "densi"
summary(density)
area densi
A:20 Min. : 8.094
B:20 1st Qu.:10.676
C:20 Median :11.876
D:20 Mean :11.862
E:20 3rd Qu.:12.882
Max. :16.617
You must remove zones (area) B and E and keep only the density values strictly greater than 12.5.
Snippets are provided at the end of the question.
set.seed(43)
rep(c("A", "B", "C", "D", "E"), each = 20) -> t
tt <- c(rnorm(n = 20, mean = 10, sd = 1),
rnorm(n = 20, mean = 12, sd = 1),
rnorm(n = 20, mean = 11, sd = 1),
rnorm(n = 20, mean = 12, sd = 1),
rnorm(n = 20, mean = 14, sd = 1))
density <- tibble::tibble(area = as.factor(t) , densi = tt)
density %>.%
dplyr::filter(., area %in% c("A", "C", "D") & densi > 12.5) %>.%
nrow(.)-> t111
Snippets
#.dosel select cases
DF %>.%
filter(., CONDITIONS) -> DF2
DF %>.%
select(., VAR1, VAR2) -> DF2
ID | Workload | Age | Gender |
---|---|---|---|
1 | intensive | 18 | M |
2 | light | 24 | M |
3 | moderate | 20 | F |
4 | moderate | 19 | M |
Based on the data above, answer the following questions.
Based on a questionnaire on body mass index (BMI) and physical activity, researchers classified individuals by level of physical activity and by BMI category. They obtained the following table:
Activity level | Underweight | Normal | Overweight | Obesity |
---|---|---|---|---|
Occasional physical activity | 66 | 72 | 47 | 35 |
Regular physical activity | 70 | 62 | 16 | 22 |
High-level physical activity | 34 | 55 | 42 | 50 |
To help you reason, here are the row sums and the column sums:
Occasional physical activity | Regular physical activity | High-level physical activity |
---|---|---|
220 | 170 | 181 |
Underweight | Normal | Overweight | Obesity |
---|---|---|---|
170 | 189 | 105 | 107 |
A code area is at your disposal.
Answer the questions below based on the table above:
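As a hedged sketch of how the code area could be used with this table, the example below enters the counts as a matrix, recomputes the marginal sums, and derives the frequencies expected under independence with chisq.test(). The object names (bmi_table, row_totals, col_totals, chi2) and the shortened English labels are illustrative assumptions, and the exam questions themselves are not reproduced here.

```r
# Contingency table of activity level (rows) by BMI category (columns)
bmi_table <- matrix(
  c(66, 72, 47, 35,
    70, 62, 16, 22,
    34, 55, 42, 50),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    activity = c("Occasional", "Regular", "High level"),
    category = c("Underweight", "Normal", "Overweight", "Obesity")))

# Marginal sums, matching those given in the question
row_totals <- rowSums(bmi_table)
col_totals <- colSums(bmi_table)

# Chi-square test of independence; the expected frequencies are
# (row total * column total) / grand total for each cell
chi2 <- chisq.test(bmi_table)
print(chi2)
print(round(chi2$expected, 2))
```

For instance, the expected count for the first cell is 220 * 170 / 571, since 571 is the grand total of the table.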
Consider a maize field (treat the number of maize plants as very large) in which plant height has a mean of 139 cm and a standard deviation of 22 cm.
Snippets are provided at the end of the question.
Snippets
### .iu: uniform distribution
punif(QUANTILES, min = 0, max = 1, lower.tail = TRUE)
qunif(PROBABILITIES, min = 0, max = 1, lower.tail = TRUE)
### .in: normal distribution
pnorm(QUANTILES, mean = 0, sd = 1, lower.tail = TRUE)
qnorm(PROBABILITIES, mean = 0, sd = 1, lower.tail = TRUE)
### .il: log-normal distribution
plnorm(QUANTILES, meanlog = 0, sdlog = 1, lower.tail = TRUE)
qlnorm(PROBABILITIES, meanlog = 0, sdlog = 1, lower.tail = TRUE)
### .it: Student's t distribution
.mu <- 0; .s <- 1; pt((QUANTILES - .mu)/.s, df = DEGREES_OF_FREEDOM, lower.tail = TRUE)
.mu <- 0; .s <- 1; .mu + .s * qt(PROBABILITIES, df = DEGREES_OF_FREEDOM, lower.tail = TRUE)
### .ib: binomial distribution
pbinom(QUANTILES, size = N_TRIALS, prob = SUCCESS_PROB, lower.tail = TRUE)
qbinom(PROBABILITIES, size = N_TRIALS, prob = SUCCESS_PROB, lower.tail = TRUE)
### .ip: Poisson distribution
ppois(QUANTILES, lambda = MEAN_OCCURENCES, lower.tail = TRUE)
qpois(PROBABILITIES, lambda = MEAN_OCCURENCES, lower.tail = TRUE)
### .ic: chi-square distribution
pchisq(QUANTILES, df = DEGREES_OF_FREEDOM, lower.tail = TRUE)
qchisq(PROBABILITIES, df = DEGREES_OF_FREEDOM, lower.tail = TRUE)
### .if: F distribution
pf(QUANTILES, df1 = NUMERATOR_DF, df2 = DENOMINATOR_DF, lower.tail = TRUE)
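For the maize field described above, the normal-distribution snippets can be filled in as in the hedged sketch below. Only the mean (139 cm) and standard deviation (22 cm) come from the question; the thresholds 120 cm and 160 cm and the 0.90 probability are hypothetical values chosen for illustration, as are the variable names.

```r
# Parameters given in the question
mu <- 139     # mean height, in cm
sigma <- 22   # standard deviation, in cm

# P(height < 120 cm): area to the left of the quantile (lower tail)
p_below_120 <- pnorm(120, mean = mu, sd = sigma, lower.tail = TRUE)

# P(height > 160 cm): area to the right of the quantile (upper tail)
p_above_160 <- pnorm(160, mean = mu, sd = sigma, lower.tail = FALSE)

# Height below which 90% of the plants fall (quantile function)
q90 <- qnorm(0.90, mean = mu, sd = sigma, lower.tail = TRUE)

print(c(p_below_120 = p_below_120, p_above_160 = p_above_160, q90 = q90))
```

The lower.tail argument is the part most often set wrong: TRUE returns the probability of being below the quantile, FALSE the probability of being above it.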
set.seed(43)
weight <- tibble::tibble(weight = c(rnorm(n = 15, mean = 100, sd = 5),
rnorm(n = 15, mean = 102, sd = 5)),
area = rep(c("a", "b"), each = 15))
weight$area <- as.factor(weight$area)
The weight dataset is at your disposal; here is some information about it:
# dataset name: weight
# names of the variables in the dataset
names(weight)
[1] "weight" "area"
# summary of the variables
summary(weight)
weight area
Min. : 90.47 a:15
1st Qu.: 97.71 b:15
Median :100.00
Mean :100.33
3rd Qu.:102.71
Max. :112.32
Perform a two-sided Student's t-test with a significance level \(\alpha\) of 0.05, assuming unequal variances.
Snippets are provided at the end of the question.
# .hm
# .hmttestindep: independent Student's t-test
t.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}, alternative = "two.sided", conf.level = 0.95, var.equal = TRUE)
# .hmttestpaired: paired Student's t-test
t.test(${1:DF}\$XNUM, ${1:DF}\$YNUM, alternative = "two.sided", conf.level = 0.95, paired = TRUE)
# .hmttestuni: univariate Student's t-test
t.test(${1:DF}\$${2:XNUM}, alternative = "two.sided", mu = 0, conf.level = 0.95)
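The independent t-test snippet above sets var.equal = TRUE, whereas the question asks for unequal variances. As a hedged sketch, the example below rebuilds the simulated weight data and switches that argument to FALSE, which gives Welch's version of the test. Using data.frame() rather than tibble::tibble() and the object name welch are assumptions made for a dependency-free illustration.

```r
# Rebuild the simulated data from the setup chunk (same seed, same draws)
set.seed(43)
weight <- data.frame(
  weight = c(rnorm(n = 15, mean = 100, sd = 5),
    rnorm(n = 15, mean = 102, sd = 5)),
  area = factor(rep(c("a", "b"), each = 15)))

# Two-sided test at alpha = 0.05 (conf.level = 1 - alpha);
# var.equal = FALSE applies the Welch correction to the degrees of freedom
welch <- t.test(weight ~ area, data = weight,
  alternative = "two.sided", conf.level = 0.95, var.equal = FALSE)
print(welch)
```

With the Welch correction, the degrees of freedom are generally non-integer and never exceed the n1 + n2 - 2 used by the equal-variance test.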
Compute P(\(Y = 2\)) for a Poisson distribution with lambda equal to 5.
Snippets are provided at the end of the question.
## Distribution: poisson #########################################################################
# ..i (d)istribution: poisson
#.ip
# .ipcumul: poisson dist. - cumulative dens. plot
plot(0:(${1:MEAN_OCCURENCES}+20), ppois(0:(${1:MEAN_OCCURENCES}+20), lambda = ${1:MEAN_OCCURENCES}), type = "s",
col = "black", xlab = "Quantiles", ylab = "Cumulative probability")
#.ipdens: poisson dist. - density plot
plot(0:(${1:MEAN_OCCURENCES}+20), dpois(0:(${1:MEAN_OCCURENCES}+20), lambda = ${1:MEAN_OCCURENCES}), type = "h",
col = "black", xlab = "Quantiles", ylab = "Probability mass")
# .iptable: poisson dist. - table of probabilities
(.table <- data.frame(occurences = 0:(${1:MEAN_OCCURENCES}+20), probability = dpois(0:(${1:MEAN_OCCURENCES}+20),
lambda = ${1:MEAN_OCCURENCES})))
# .iprandom: poisson dist. - random
rpois(${1:<N>}, lambda = ${2:MEAN_OCCURENCES})
# .ipquant: poisson dist. - quantiles
qpois(${1:PROBABILITIES}, lambda = ${2:MEAN_OCCURENCES}, lower.tail = ${3:TRUE})
# .ipproba: poisson dist. - probabilities
ppois(${1:QUANTILES}, lambda = ${2:MEAN_OCCURENCES}, lower.tail = ${3:TRUE})
## Distribution: binomial #########################################################################
# ..i (d)istribution: binomial
# .ib
# .ibcumul: binomial dist. - cumulative dens. plot
plot(0:${1:N_TRIALS}, pbinom(0:${1:N_TRIALS}, size = ${1:N_TRIALS}, prob = ${2:SUCCESS_PROB}), type = "s",
col = "black", xlab = "Quantiles", ylab = "Cumulative probability")
#.ibdens: binomial dist. - density plot
plot(0:${1:N_TRIALS}, dbinom(0:${1:N_TRIALS}, size = ${1:N_TRIALS}, prob = ${2:SUCCESS_PROB}), type = "h",
col = "black", xlab = "Quantiles", ylab = "Probability mass")
# .ibtable: binomial dist. - table of probabilities
(.table <- data.frame(success = 0:${1:N_TRIALS},
probability = dbinom(0:${1:N_TRIALS}, size = ${1:N_TRIALS}, prob = ${2:SUCCESS_PROB})))
# .ibrandom: binomial dist. - random
rbinom(${1:N}, size = ${2:N_TRIALS}, prob = ${3:SUCCESS_PROB})
# .ibquant: binomial dist. - quantiles
qbinom(${1:PROBABILITIES}, size = ${2:N_TRIALS}, prob = ${3:SUCCESS_PROB}, lower.tail = ${4:TRUE})
# .ibproba: binomial dist. - probabilities
pbinom(${1:QUANTILES}, size = ${2:N_TRIALS}, prob = ${3:SUCCESS_PROB}, lower.tail = ${4:TRUE})
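The probability snippets above return cumulative probabilities, P(Y <= q), so a point probability such as P(Y = 2) needs either the mass function or a difference of two cumulative values. The hedged sketch below shows three equivalent routes for lambda = 5; the variable names are illustrative.

```r
lambda <- 5

# Route 1: probability mass function, P(Y = 2) directly
p_exact <- dpois(2, lambda = lambda)

# Route 2: difference of cumulative probabilities, P(Y <= 2) - P(Y <= 1)
p_diff <- ppois(2, lambda = lambda) - ppois(1, lambda = lambda)

# Route 3: the Poisson formula, exp(-lambda) * lambda^k / k! with k = 2
p_formula <- exp(-lambda) * lambda^2 / factorial(2)

print(c(p_exact, p_diff, p_formula))  # all three are approximately 0.0842
```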
Several births of Ailuropoda melanoleuca (David, 1868) took place at the giant panda research and breeding centre located in Sichuan province (China). The scientists weighed the newborns; here are their masses in grams:
117.442 104.566 98.818 116.754 88.41
The scientists want to know the mean and the standard deviation for these 5 new individuals. A code area is at your disposal to carry out your calculations.
Answer the question below:
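A minimal sketch of the calculation in the code area, assuming the five masses are entered by hand into a vector (the names mass, mass_mean and mass_sd are illustrative). Note that sd() in R uses the sample formula with n - 1 in the denominator.

```r
# Masses of the five newborns, in grams
mass <- c(117.442, 104.566, 98.818, 116.754, 88.41)

mass_mean <- mean(mass)  # arithmetic mean: 105.198 g
mass_sd <- sd(mass)      # sample standard deviation (n - 1 denominator)

# The same standard deviation written out from its definition
mass_sd_manual <- sqrt(sum((mass - mass_mean)^2) / (length(mass) - 1))

print(c(mean = mass_mean, sd = mass_sd))
```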
A study is conducted within a company that is interested in its own gender parity.
Men Women
2312 2165
A code area is at your disposal to carry out your calculations.
Snippets are provided at the end of the question.
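One possible way to approach this in the code area, shown as a hedged sketch: a univariate proportion test comparing the observed share of men with the 0.5 expected under parity. The counts come from the table above; the choice of prop.test() with correct = FALSE mirrors the .hpuni snippet but is an assumption about the intended analysis, and the object names are illustrative.

```r
# Observed head counts from the study
counts <- c(Men = 2312, Women = 2165)

# Two-sided test of H0: proportion of men = 0.5, without continuity correction
parity <- prop.test(x = counts[["Men"]], n = sum(counts), p = 0.5,
  alternative = "two.sided", conf.level = 0.95, correct = FALSE)
print(parity)
```

The estimate reported is the observed proportion of men, 2312 / 4477, and the test statistic follows a chi-square distribution with 1 degree of freedom under the null hypothesis.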
Following the analysis above, answer the question below:
Snippets
## Hypothesis tests ####################################################################################
snippet ...hypothesis tests
..h
## Hypothesis tests: Correlation #########################################################################
snippet ..hypothesis tests: correlation
.hc
snippet .hccorr: correlation test
cor.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XNUM},
alternative = "two.sided", method = "pearson")
## Hypothesis tests: Variances #########################################################################
snippet ..hypothesis tests: variances
.hv
snippet .hvftest: two-variances F-test
var.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR},
alternative = "two.sided", conf.level = 0.95)
snippet .hvlevene: Levene test
car::leveneTest(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR})
snippet .hvbartlett: Bartlett test
bartlett.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR})
## Hypothesis tests: Proportions #######################################################################
snippet ..hypothesis tests: proportions
.hp
snippet .hpuni: univariate proportion test
prop.test(rbind(table(${1:DF}\$XFACTOR)),
alternative = "two.sided", p = ${3:0.5}, conf.level = 0.95, correct = FALSE)
snippet .hpbi: bivariate proportion test
prop.test(rbind(table(${1:DF}\$XFACTOR, ${1:DF}\$YFACTOR)),
alternative = "two.sided", conf.level = 0.95, correct = FALSE)
## Hypothesis tests: Nonparametric ####################################################################
snippet ..hypothesis tests: nonparametric
.hn
snippet .hnfried: Friedman test
friedman.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR} | ${4:BLOCK})
snippet .hnkrusmult: Kruskal-Wallis - multiple comparisons [nparcomp]
summary(kw_comp. <- nparcomp::nparcomp(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}))
plot(kw_comp.)
snippet .hnkrus: Kruskal-Wallis test
kruskal.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR})
snippet .hnwilkindep: independent Wilcoxon test
wilcox.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR},
alternative = "two.sided", conf.level = 0.95)
snippet .hnwilkpaired: paired Wilcoxon test
wilcox.test(${1:DF}\$XNUM, ${1:DF}\$YNUM,
alternative = "two.sided", conf.level = 0.95, paired = TRUE)
## Hypothesis tests: Means #############################################################################
snippet ..hypothesis tests: means
.hm
snippet .hmanovamult: anova - multiple comparisons [multcomp]
summary(anovaComp. <- confint(multcomp::glht(anova.,
linfct = multcomp::mcp(${1:XFACTOR} = "Tukey")))) # Add a second factor if you want
.oma <- par(oma = c(0, 5.1, 0, 0)); plot(anovaComp.); par(.oma); rm(.oma)
snippet .hmanovaresid: anova - residuals
residuals(anova.)
snippet .hmanovaqqplot: anova - residuals QQ-plot
#plot(anova., which = 2)
anova. %>.%
chart(broom::augment(.), aes(sample = .std.resid)) +
geom_qq() +
geom_qq_line(colour = "darkgray") +
labs(x = "Theoretical quantiles", y = "Standardized residuals") +
ggtitle("Normal Q-Q")
snippet .hmanova2nested: two-way ANOVA (nested model)
anova(anova. <- lm(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR} + ${4:BLOCK} %in% ${3:XFACTOR}))
snippet .hmanova2noint: two-way ANOVA (without interactions)
anova(anova. <- lm(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR1} + ${4:XFACTOR2}))
snippet .hmanova2: two-way ANOVA (complete model)
anova(anova. <- lm(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR1} * ${4:XFACTOR2}))
snippet .hmanova2desc: two-way ANOVA (description)
${1:DF} %>.%
group_by(., ${2:XFACTOR1}, ${3:XFACTOR2}) %>.%
summarise(., mean = mean(${4:YNUM}), sd = sd(${4:YNUM}), count = sum(!is.na(${4:YNUM})))
snippet .hmanova1: one-way ANOVA
anova(anova. <- lm(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR}))
snippet .hmanova1desc: one-way ANOVA (description)
${1:DF} %>.%
group_by(., ${2:XFACTOR}) %>.%
summarise(., mean = mean(${3:YNUM}), sd = sd(${3:YNUM}), count = sum(!is.na(${3:YNUM})))
snippet .hmttestindep: independent Student's t-test
t.test(data = ${1:DF}, ${2:YNUM} ~ ${3:XFACTOR},
alternative = "two.sided", conf.level = 0.95, var.equal = TRUE)
snippet .hmttestpaired: paired Student's t-test
t.test(${1:DF}\$XNUM, ${1:DF}\$YNUM,
alternative = "two.sided", conf.level = 0.95, paired = TRUE)
snippet .hmttestuni: univariate Student's t-test
t.test(${1:DF}\$${2:XNUM},
alternative = "two.sided", mu = 0, conf.level = 0.95)
## Hypothesis tests: Distribution #######################################################################
snippet ..hypothesis tests: distribution
.hd
snippet .hdnorm: Shapiro-Wilk test of normality
shapiro.test(${1:DF}\$XNUM)
## Hypothesis tests: Contingency #######################################################################
snippet ..hypothesis tests: contingency
.hc
snippet .hcfisher: Fisher test of independence
fisher.test(${1:TABLE})
snippet .hcchi2comp: Chi2 test (components)
round(chisq.test(${1:TABLE})[["residuals"]]^2, 2)
snippet .hcchi2bi: Chi2 test (independence)
(chi2. <- chisq.test(${1:TABLE})); cat("Expected frequencies:\n"); chi2.[["expected"]]
snippet .hcchi2uni: Chi2 test (univariate)
chisq.test(${1:TABLE}, p = ${2:PROBABILITIES}, rescale.p = FALSE)
You have just finished your exam.
Leave us your impressions of this teaching tool, or keep experimenting in the area below. Remember that to write a comment in an R code area, you must put a hash sign (#) in front of your sentences.
# Adding comments
# ...
# Not yet...