The survey Class
You need the latest development version of declared.
The survey class will be derived from the dataset class.
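difficulty_bills <- declared(
  c(0, 1, 2, -1, 0),
  labels = c(Never = 0, Time_to_time = 1, Always = 2, DK = -1)
)

age_exact <- declared(
  c(34, 45, 21, 55, -1),
  labels = c(A = 34, A = 45, A = 21, A = 55, DK = -1)
)

# DK (-1) is declared as a label and a missing value even though
# no observation uses it; this matches the printed output of listen_spotify.
listen_spotify <- declared(
  c(0, 1, 9, 0, 1),
  labels = c(No = 0, Yes = 1, Inap = 9, DK = -1),
  na_values = c(9, -1)
)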
raw_survey <- data.frame(
  obs_id = obs_id,
  geo = geo,
  listen_spotify = listen_spotify,
  sex = sex,
  age_exact = age_exact,
  difficulty_bills = difficulty_bills
)

survey_dataset <- dataset(
  x = raw_survey,
  Dimensions = "geo",
  Measures = c("listen_spotify", "sex", "age_exact", "difficulty_bills"),
  Attributes = NULL,
  sdmx_attributes = "geo",
  Title = "Tiny Survey",
  Creator = person("Jane", "Doe")
)
dublincore(survey_dataset)
#> Title: Tiny Survey
#> Publiser: | Source: | Date: 19340 | Language: | Identifier: | Rights: | Description: NA |
#> names: obs_id, geo, listen_spotify, sex, age_exact, difficulty_bills
#> - dimensions: geo (character)
#> - measures: listen_spotify (declared|integer) sex (declared|integer) age_exact (declared|integer) difficulty_bills (declared|integer)
#> - attributes: <none>
It is good practice to define labels in declared that are valid but not yet present in the data, because in a retrospective harmonization workflow the variable may be concatenated (bound) together with further observations that do use the currently unused label. In this example, the DK or declined label is defined but does not occur among the observed values of listen_spotify.
print(listen_spotify)
#> <declared<integer>>
#>  0 1 NA(9) 0 1
#> Missing values: 9, -1
#>
#> Labels:
#>  value label
#>      0    No
#>      1   Yes
#>      9  Inap
#>     -1    DK
summary(listen_spotify)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
#>     0.0     0.0     0.5     0.5     1.0     1.0       1
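The reasoning behind declaring unused labels can be illustrated with a minimal base R sketch. It does not use declared objects at all: plain named vectors stand in for the label sets, `second_batch` is a made-up later batch of observations, and `unlabelled_codes()` is a hypothetical helper written only for this illustration.

```r
# Base R sketch; plain named vectors stand in for declared labels.
labels_with_dk    <- c(No = 0, Yes = 1, Inap = 9, DK = -1)
labels_without_dk <- c(No = 0, Yes = 1, Inap = 9)

first_batch  <- c(0, 1, 9, 0, 1)  # the listen_spotify values used in this example
second_batch <- c(1, -1, 0)       # hypothetical later observations with a DK answer

# Concatenate the two batches, as would happen in retrospective harmonization
combined <- c(first_batch, second_batch)

# Hypothetical helper: codes present in the data that have no label
unlabelled_codes <- function(values, labels) {
  setdiff(unique(values), unname(labels))
}

unlabelled_codes(combined, labels_with_dk)     # every code is covered
unlabelled_codes(combined, labels_without_dk)  # the DK code (-1) has no label
```

If DK is declared up front, the combined data stays fully labelled; if it is left out, the new -1 values arrive without a meaning attached.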
survey_dataset <- dublincore_add(
  survey_dataset,
  Title = "Tiny Survey",
  Creator = person("Daniel", "Antal"),
  Identifier = "https://doi.org/xxxx.yyyyy",
  Publisher = "Reprex",
  Date = 2022,
  Subject = "Surveys",
  Language = "en"
)
The survey class inherits elements of the dataset class, but it will be more strictly defined. I am considering making every single column declared, except for numeric types with […] DK would map nicely to the CL_OBS_STATUS SDMX codes, which make missing observations explicit and try to categorize them.
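As a rough sketch of what such a mapping could look like, the base R snippet below derives an observation status for each listen_spotify value. The mapping itself (`Inap` to `M`, `DK` to `O`) is an assumption made for illustration and is not part of the dataset or declared packages; the code meanings in the comments reflect my reading of the CL_OBS_STATUS code list and should be checked against the official list.

```r
# Hypothetical mapping from survey missing-value labels to CL_OBS_STATUS codes.
# Assumed meanings: A = normal value, M = missing value (data cannot exist),
# O = missing value (unspecified reason).
obs_status_map <- c(Inap = "M", DK = "O")

value_labels <- c(No = 0, Yes = 1, Inap = 9, DK = -1)
na_values    <- c(9, -1)
x            <- c(0, 1, 9, 0, 1)  # the listen_spotify values used in this example

# Look up the label of each observed code
label_of_x <- names(value_labels)[match(x, value_labels)]

# Declared missing values get their mapped status; everything else is "A"
obs_status <- unname(ifelse(x %in% na_values, obs_status_map[label_of_x], "A"))

data.frame(value = x, label = label_of_x, obs_status = obs_status)
```

Keeping the status in a separate column would leave the reason for missingness explicit after the coded value itself is treated as missing.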
dublincore(survey_dataset)
#> Title: Tiny Survey
#> Publiser: Reprex | Source: | Date: 19340 | Language: eng | Identifier: https://doi.org/xxxx.yyyyy | Rights: | Description: |
#> names: obs_id, geo, listen_spotify, sex, age_exact, difficulty_bills
#> - dimensions: geo (character)
#> - measures: listen_spotify (declared|integer) sex (declared|integer) age_exact (declared|integer) difficulty_bills (declared|integer)
#> - attributes: <none>
summary method implemented for
will need new
summary(survey_dataset)
#> Tiny Survey [https://doi.org/xxxx.yyyyy] by Daniel Antal
#> Published by Reprex
#>     obs_id              geo            listen_spotify      sex
#>  Length:5           Length:5           Min.   :0.0    Min.   :0.00
#>  Class :character   Class :character   1st Qu.:0.0    1st Qu.:0.75
#>  Mode  :character   Mode  :character   Median :0.5    Median :1.00
#>                                        Mean   :0.5    Mean   :0.75
#>                                        3rd Qu.:1.0    3rd Qu.:1.00
#>                                        Max.   :1.0    Max.   :1.00
#>                                        NA's   :1      NA's   :1
#>    age_exact    difficulty_bills
#>  Min.   :-1.0   Min.   :-1.0
#>  1st Qu.:21.0   1st Qu.: 0.0
#>  Median :34.0   Median : 0.0
#>  Mean   :30.8   Mean   : 0.4
#>  3rd Qu.:45.0   3rd Qu.: 1.0
#>  Max.   :55.0   Max.   : 2.0
#>
A survey should contain the entire processing history from creation, and optionally the metadata for publication created with datacite_add(). A similar dublincore_add function uses the Dublin Core metadata.
Eventually, a connection to the zen4R package will make sure that a correctly described dataset can get a Zenodo record, receive a DOI, have the DOI recorded in the object, and be uploaded to Zenodo.