Dataverse: v5.11 Release

Release date:
June 13, 2022
Previous version:
v5.10.1 (released April 6, 2022)
Magnitude:
5,501 Diff Delta
Contributors:
20 total committers
Data confidence:
Commits:

75 Features Released with v5.11

Top Contributors in v5.11

qqmyers
sekmiller
pdurbin
poikilotherm
ErykKul
lubitchv
landreev
JayanthyChengan
pkiraly
donsizemore

Directory Browser for v5.11

We haven't yet finished calculating and confirming the files and directories changed in this release. Please check back soon.

Release Notes Published

Dataverse Software 5.11

This release brings new features, enhancements, and bug fixes to the Dataverse Software. Thank you to all of the community members who contributed code, suggestions, bug reports, and other assistance across the project.

Release Highlights

Terms of Access or Request Access Required for Restricted Files

Beginning in this release, datasets with restricted files must have either Terms of Access or Request Access enabled. This change is to ensure that for each file in a Dataverse installation there is a clear path to get to the data, either through requesting access to the data or to provide context about why requesting access is not enabled.

Published datasets are not affected by this change. Datasets that are in draft and that have neither Terms of Access nor Request Access enabled must be updated to select one or the other (or both). Otherwise, datasets cannot be futher edited or published. Dataset authors will be able to tell if their dataset is affected by the presence of the following message at the top of their dataset (when they are logged in):

"Datasets with restricted files are required to have Request Access enabled or Terms of Access to help people access the data. Please edit the dataset to confirm Request Access or provide Terms of Access to be in compliance with the policy."

At this point, authors should click "Edit Dataset" then "Terms" and then check the box for "Request Access" or fill in "Terms of Access for Restricted Files" (or both). Afterwards, authors will be able to further edit metadata and publish.

In the "Notes for Dataverse Installation Administrators" section, we have provided a query to help proactively identify datasets that need to be updated.

See also Issue #8191 and PR #8308.

Muting Notifications

Users can control which notifications they receive if the system is configured to allow this. See also Issue #7492 and PR #8530.

Major Use Cases and Infrastructure Enhancements

Changes and fixes in this release include:

  • Terms of Access or Request Access required for restricted files. (Issue #8191, PR #8308)
  • Users can control which notifications they receive if the system is configured to allow this. (Issue #7492, PR #8530)
  • A 500 error was occuring when creating a dataset if a template did not have an associated "termsofuseandaccess". See "Legacy Templates Issue" below for details. (Issue #8599, PR #8789)
  • Tabular ingest can be skipped via API. (Issue #8525, PR #8532)
  • The "Verify Email" button has been changed to "Send Verification Email" and rather than sometimes showing a popup now always sends a fresh verification email (and invalidates previous verification emails). (Issue #8227, PR #8579)
  • For Shibboleth users, the emailconfirmed timestamp is now set on login and the UI should show "Verified". (Issue #5663, PR #8579)
  • Information about the license selection (or custom terms) is now available in the confirmation popup when contributors click "Submit for Review". Previously, this was only available in the confirmation popup for the "Publish" button, which contributors do not see. (Issue #8561, PR #8691)
  • For installations configured to support multiple languages, controlled vocabulary fields that do not allow multiple entries (e.g. journalArticleType) are now indexed properly. (Issue #8595, PR #8601, PR #8624)
  • Two-letter ISO-639-1 codes for languages are now supported, in metadata imports and harvesting. (Issue #8139, PR #8689)
  • The API endpoint for listing notifications has been enhanced to show the subject, text, and timestamp of notifications. (Issue #8487, PR #8530)
  • The API Guide has been updated to explain that the Content-type header is now (as of Dataverse 5.6) necessary to create datasets via native API. (Issue #8663, PR #8676)
  • Admin API endpoints have been added to find and delete dataset templates. (Issue 8600, PR #8706)
  • The BagIt file handler detects and transforms zip files with a BagIt package format into Dataverse data files, validating checksums along the way. See the BagIt File Handler section of the Installation Guide for details. (Issue #8608, PR #8677)
  • For BagIt Export, the number of threads used when zipping data files into an archival bag is now configurable using the :BagGeneratorThreads database setting. (Issue #8602, PR #8606)
  • PostgreSQL 14 can now be used (though we've tested mostly with 13). PostgreSQL 10+ is required. (Issue #8295, PR #8296)
  • As always, widgets can be embedded in the <iframe> HTML tag, but the HTTP header "Content-Security-Policy" is now being sent on non-widget pages to prevent them from being embedded. (PR #8662)
  • URIs in the the experimental Semantic API have changed (details below). (Issue #8533, PR #8592)
  • Installations running Make Data Count can upgrade to Counter Processor-0.1.04. (Issue #8380, PR #8391)
  • PrimeFaces, the UI framework we use, has been upgraded from 10 to 11. (Issue #8456, PR #8652)

Notes for Dataverse Installation Administrators

Identifying Datasets Requiring Terms of Access or Request Access Changes

In support of the change to require either Terms of Access or Request Access for all restricted files (see above for details), we have provided a query to identify datasets in your installation where at least one restricted file has neither Terms of Access nor Request Access enabled:

https://github.com/IQSS/dataverse/blob/v5.11/scripts/issues/8191/datasets_without_toa_or_request_access

This will allow you to reach out to those dataset owners as appropriate.

Legacy Templates Issue

When custom license functionality was added, dataverses that had older legacy templates as their default template would not allow the creation of a new dataset (500 error).

This occurred because those legacy templates did not have an associated termsofuseandaccess linked to them.

In this release, we run a script that creates a default empty termsofuseandaccess for each of these templates and links them.

Note the termsofuseandaccess that are created this way default to using the license with id=1 (cc0) and the fileaccessrequest to false.

See also Issue #8599 and PR #8789.

PostgreSQL Version 10+ Required

This release upgrades the bundled PostgreSQL JDBC driver to support major version 14.

Note that the newer PostgreSQL driver required a Flyway version bump, which entails positive and negative consequences:

  • The newer version of Flyway supports PostgreSQL 14 and includes a number of security fixes.
  • As of version 8.0 the Flyway Community Edition dropped support for PostgreSQL 9.6 and older.

This means that as foreshadowed in the 5.10 and 5.10.1 release notes, version 10 or higher of PostgreSQL is now required. For suggested upgrade steps, please see "PostgreSQL Update" in the release notes for 5.10: https://github.com/IQSS/dataverse/releases/tag/v5.10

Counter Processor 0.1.04 Support

This release includes support for counter-processor-0.1.04 for processing Make Data Count metrics. If you are running Make Data Counts support, you should reinstall/reconfigure counter-processor as described in the latest Guides. (For existing installations, note that counter-processor-0.1.04 requires a newer version of Python so you will need to follow the full counter-processor install. Also note that if you configure the new version the same way, it will reprocess the days in the current month when it is first run. This is normal and will not affect the metrics in Dataverse.)

New JVM Options and DB Settings

The following DB settings have been added:

  • :ShowMuteOptions
  • :AlwaysMuted
  • :NeverMuted
  • :CreateDataFilesMaxErrorsToDisplay
  • :BagItHandlerEnabled
  • :BagValidatorJobPoolSize
  • :BagValidatorMaxErrors
  • :BagValidatorJobWaitInterval
  • :BagGeneratorThreads

See the Database Settings section of the Guides for more information.

Notes for Developers and Integrators

See the "Backward Incompatibilities" section below.

Backward Incompatibilities

Semantic API Changes

This release includes an update to the experimental semantic API and the underlying assignment of URIs to metadata block terms that are not explicitly mapped to terms in community vocabularies. The change affects the output of the OAI_ORE metadata export, the OAI_ORE file in archival bags, and the input/output allowed for those terms in the semantic API.

For those updating integrating code or existing files intended for input into this release of Dataverse, URIs of the form...

https://dataverse.org/schema/<block name>/<parentField name>#<childField title>

and

https://dataverse.org/schema/<block name>/<Field title>

...are both replaced with URIs of the form...

https://dataverse.org/schema/<block name>/<Field name>.

Create Dataset API Requires Content-type Header (Since 5.6)

Due to a code change introduced in Dataverse 5.6, calls to the native API without the Content-type header will fail to create a dataset. The API Guide has been updated to indicate the necessity of this header: https://guides.dataverse.org/en/5.11/api/native-api.html#create-a-dataset-in-a-dataverse-collection

Complete List of Changes

For the complete list of code changes in this release, see the 5.11 Milestone in GitHub.

For help with upgrading, installing, or general questions please post to the Dataverse Community Google Group or email [email protected].

Installation

If this is a new installation, please see our Installation Guide. Please also contact us to get added to the Dataverse Project Map if you have not done so already.

Upgrade Instructions

0. These instructions assume that you've already successfully upgraded from Dataverse Software 4.x to Dataverse Software 5 following the instructions in the Dataverse Software 5 Release Notes. After upgrading from the 4.x series to 5.0, you should progress through the other 5.x releases before attempting the upgrade to 5.11.

If you are running Payara as a non-root user (and you should be!), remember not to execute the commands below as root. Use sudo to change to that user first. For example, sudo -i -u dataverse if dataverse is your dedicated application user.

In the following commands we assume that Payara 5 is installed in /usr/local/payara5. If not, adjust as needed.

export PAYARA=/usr/local/payara5

(or setenv PAYARA /usr/local/payara5 if you are using a csh-like shell)

1. Undeploy the previous version.

  • $PAYARA/bin/asadmin list-applications
  • $PAYARA/bin/asadmin undeploy dataverse<-version>

2. Stop Payara and remove the generated directory

  • service payara stop
  • rm -rf $PAYARA/glassfish/domains/domain1/generated

3. Start Payara

  • service payara start

4. Deploy this version.

  • $PAYARA/bin/asadmin deploy dataverse-5.11.war

5. Restart Payara

  • service payara stop
  • service payara start

6. Reload citation metadata block

wget https://github.com/IQSS/dataverse/releases/download/v5.11/citation.tsv curl http://localhost:8080/api/admin/datasetfield/load -X POST --data-binary @citation.tsv -H "Content-type: text/tab-separated-values"

7. Update Solr schema.xml

Note that if you have custom metadata blocks you can skip this step and proceed to the next one.

Edit schema.xml and for journalArticleType change multiValued from "false" to "true" and then restart Solr. Alternatively, download and use the version from https://github.com/IQSS/dataverse/releases/download/v5.11/schema.xml . By default the file can be found at /usr/local/solr/solr-8.11.1/server/solr/collection1/conf/schema.xml.

7b. For installations with custom metadata blocks

Use the script provided in the release to add the custom fields to the base schema.xml installed in the previous step.

   wget https://github.com/IQSS/dataverse/releases/download/v5.11/update-fields.sh
   chmod +x update-fields.sh
   curl "http://localhost:8080/api/admin/index/solr/schema" | ./update-fields.sh /usr/local/solr/solr-8.11.1/server/solr/collection1/conf/schema.xml

(Note that the curl command above calls the admin API on localhost to obtain the list of the custom fields. In the unlikely case that you are running the main Dataverse Application and Solr on different servers, generate the schema.xml on the application node, then copy it onto the Solr server.)

8. Re-export metadata files (only OAI_ORE is affected)

People archiving Bags should re-archive. Follow the directions in the Admin Guide

9. (Optional) Delete duplicate templates in database

Prior to this release making a copy of a dataset template was creating two copies, only one of which is visible in the dataverse collection and usable. The other was not being assigned a collection was invisible to the user (#8600).

If you would like to remove these orphan templates you may run the following script:

https://github.com/IQSS/dataverse/blob/v5.11/scripts/issues/8600/delete_orphan_templates_8600.sh

Also, admin APIs for finding and deleting templates have been added: https://guides.dataverse.org/en/5.11/api/native-api.html#list-dataset-templates