dataset 0.4.1
This release strengthens the handling of semantically enriched vectors and improves coercion across base R and tidyverse workflows.
Enhancements
- New S3 methods for semantically enriched logical, Date, and POSIXct types.
- Expanded coercion support:
  - as_numeric(), as_character(), as_logical(), as_factor()
  - Optional preservation of semantic metadata (label, unit, concept, namespace).
- Rewritten coercion logic for all defined() vector types, ensuring stable and predictable behaviour.
dataset_df improvements
- as.data.frame.dataset_df() and as_tibble.dataset_df():
  - Correct handling of numeric, character, factor, and date-time columns.
  - Label-aware coercion for categorical variables.
  - Clear separation between attribute stripping and preservation.
Testing and robustness
- Significant increase in test coverage, including tests for all coercion paths, metadata stripping, and temporal types.
- Improved error messaging for invalid type coercion.
- More consistent printing and formatting of defined vectors.
This update improves reliability, consistency, and interoperability of semantically enriched datasets in R.
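As an illustration of the optional metadata preservation listed under Enhancements, here is a minimal base R sketch. The helper toy_as_numeric() and its preserve_attributes argument are hypothetical names invented for this example; they are not the signature of the package's as_numeric(), and the package's own implementation may differ.

```r
# Hypothetical helper (NOT the package's as_numeric()): coerce a vector to
# numeric and either drop or keep the semantic attributes.
toy_as_numeric <- function(x, preserve_attributes = FALSE) {
  semantic <- c("label", "unit", "concept", "namespace")
  # Collect only the semantic attributes that are actually present on x.
  kept <- attributes(x)[intersect(names(attributes(x)), semantic)]
  # as.vector() drops all attributes, leaving the bare underlying values.
  out <- as.numeric(as.vector(unclass(x)))
  if (preserve_attributes) attributes(out) <- kept
  out
}

# A toy enriched vector built with base R only (assumed attribute names).
x <- structure(c("1", "2", "3"),
               label = "Count", unit = "pieces",
               class = "toy_defined")

plain    <- toy_as_numeric(x)                              # metadata stripped
enriched <- toy_as_numeric(x, preserve_attributes = TRUE)  # metadata kept
```

Stripping is the default in this sketch so that the result behaves like an ordinary numeric vector unless the caller explicitly asks for the metadata to be kept.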
dataset 0.4.0
CRAN release: 2025-08-26
A new CRAN release with much improved unit testing and documentation to meet the rOpenSci standards, and better methods for the main S3 classes of the package.
- Rewritten vignettes.
- Improved print and summary methods for dataset_df and defined.
- Better handling of multiple contributors in bibrecord.
- New dataset_to_triples and xsd_convert for better serialisation.
- Better handling of empty nodes in RDF.
- Many bug fixes in the way semantic information is translated to RDF.
- var_labels() now behaves similarly to labelled::var_label(); in general, haven_labelled_defined as an S3 class works better in the tidyverse.
- New bibliographic helper functions for dataset_format() and contributor().
- Countless small bug fixes for edge cases when converting to various metadata schemas, such as missing contributors, formatted subjects, etc.
- Better handling of structured metadata with subject().
dataset 0.3.9
CRAN release: 2025-05-25
- New CRAN release with many bug fixes, and improvements from peer-review.
- The definition attribute is renamed to concept.
- Improved printing for defined and dataset_df classes.
- Improved compatibility and coercion methods for base R character and numeric types.
- A clearer bibrecord class for extending utils::person and utils::bibentry classes for more modern and cleaner bibliographic references.
dataset 0.3.4027
- The new bibrecord() class is the superclass of the dublincore and datacite() classes; these classes have a new print method and they conform to the current library standard DCTERMS and the current repository standard DataCite; unlike utils::bibentry(), they handle contributors and their roles, identifiers, and many other attributes.
- Breaking change: the definition metadata field in the defined() class is changed to the more understandable concept name.
- The defined() vectors print nicely, and the dataset_df() class is more readable, too.
- The missing examples are present, including examples on the use of the semantically richer orange_df example dataset.
- Many code quality improvements and new tests.
dataset 0.3.4023
- Changed iris_df to orange_df in all examples.
- xsd_convert() handles difftime classes and edge cases.
- Small errors fixed in examples.
- Test coverage increased.
- The master branch is renamed to main.
dataset 0.3.4021
- Added support for generic vector methods: length(), head(), tail(), as.vector(), as.list(), and subsetting ([, [[).
- Implemented comparison methods (==, <, >, etc.) that operate on the underlying data while maintaining semantic integrity.
- Introduced custom print() and format() methods that summarise metadata (label, unit, definition) in a concise and human-readable manner.
- Improved the summary() method for defined vectors to display variable metadata and integrate seamlessly with base R statistics.
- Enhanced the c() method to validate compatibility across all semantic attributes (label, unit, definition, namespace) before concatenation.
- Extended vignette with richer examples and explanations of semantic validation, namespaces, and metadata access.
- compare_creators() internal function to add all creators to joined datasets.
This update significantly improves the usability and robustness of semantically enriched vectors in both interactive and programmatic workflows.
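The comparison and concatenation behaviour described in this release can be sketched with a toy S3 class in base R. The class name toy_defined and its constructor are hypothetical and exist only for this illustration; the sketch is not the package's implementation of defined(), and it checks only two attributes (label and unit) to stay short.

```r
# Hypothetical toy class, NOT the dataset package's defined() implementation.
toy_defined <- function(x, label = NULL, unit = NULL) {
  structure(x, label = label, unit = unit, class = "toy_defined")
}

# Binary comparison and arithmetic operators: strip class and metadata,
# then apply the operator to the underlying values.
Ops.toy_defined <- function(e1, e2) {
  strip <- function(v) {
    if (inherits(v, "toy_defined")) as.vector(unclass(v)) else v
  }
  match.fun(.Generic)(strip(e1), strip(e2))
}

# Concatenation: refuse to combine vectors whose metadata disagree.
c.toy_defined <- function(...) {
  parts <- list(...)
  for (field in c("label", "unit")) {
    values <- unique(lapply(parts, attr, which = field, exact = TRUE))
    if (length(values) > 1) {
      stop("Cannot combine vectors with different '", field, "' attributes.")
    }
  }
  raw <- unlist(lapply(parts, function(p) as.vector(unclass(p))))
  toy_defined(raw,
              label = attr(parts[[1]], "label", exact = TRUE),
              unit  = attr(parts[[1]], "unit", exact = TRUE))
}

a <- toy_defined(c(1, 2), label = "GDP", unit = "million euros")
b <- toy_defined(c(3, 4), label = "GDP", unit = "million euros")
d <- toy_defined(5, label = "GDP", unit = "dollars")

ab <- c(a, b)        # compatible metadata: concatenation succeeds
above_two <- ab > 2  # comparison runs on the underlying numbers
mismatch <- try(c(a, d), silent = TRUE)  # different units: an error
```

Checking the metadata before concatenation means that a unit mismatch surfaces as an error at the point of combination instead of silently producing a vector with misleading metadata.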
dataset 0.3.0
CRAN release: 2024-01-08
- Released on CRAN.
- 0.3.1 is a minor bug fix for unit tests on old R releases. It does not affect the functionality of the package.
dataset 0.2.9
- dataset_ttl_write(): write datasets to Turtle format;
- with helper functions get_prefix(), get_resource_identifier(), xsd_convert(), and dataset_to_triples().
dataset 0.2.7
CRAN release: 2023-12-08
- Released on CRAN
The devel branch contains new code that is not yet validated; as a whole, the package is not working consistently.
dataset 0.2.5
- datacite() has a new interface and an as_datacite() retrieval version. See the Working with DataCite Metadata vignette.
- dublincore() has a new interface and an as_dublincore() version. See the Working with Dublin Core Metadata vignette.
dataset 0.2.1
CRAN release: 2023-03-18
A minor correction to avoid vignettes downloading data from the Eurostat data warehouse on CRAN. Small readability improvements in the vignette articles.
dataset 0.2.0
CRAN release: 2022-12-14
- New methods for the dataset() S3 class: print.dataset(), summary.dataset(), subset.dataset, [.dataset, as.data.frame().
- New vignette on how to use the dataspice package programmatically for publishing dataset documentation.
- Released on CRAN.
dataset 0.1.9
CRAN release: 2022-12-02
- Incorporating minor changes from the rOpenSci and CRAN peer-reviews.
dataset 0.1.7
- After reviewing CRAN submission comments, and correcting documentation issues, submitted to rOpenSci for review before re-submitting to CRAN.
dataset 0.1.4.
Development version available on Zenodo.
- dataset_export() is implemented with filetype = 'csv'.
- Replacement functions are added to simple properties identifier(), publisher(), publication_year(), language(), description(), datasource_get() and datasource_set() [to avoid confusion with the base R source() function], geolocation(), rights(), version().
- Functions to work with structured referential metadata: dataset_title(), subject(), subject_create().
dataset 0.1.3.
- Vignette articles were started to develop and consult on the development plan of the project. See From dataset To RDF, Export and Publish A dataset Object, and Datasets with FAIR metadata; all comments are welcome.
- New functions: download_dataset(), datacite(), and the dataset() constructor.
dataset 0.1.2.
- The definition of the dataset() class, an improved data.frame (tibble, DT) R object with standardized structure and metadata.
- Adding and reading DublinCore metadata and DataCite mandatory and recommended FAIR metadata.
dataset 0.1.0.
First development version release.
- Added the Motivation of the dataset package vignette article, which was later replaced with Design Principles & Future Work Semantically Enriched, Standards-Aligned Datasets in R.
