[Rdap] saving server space in institutional repositories

O'Donnell, Megan N [LIB] mno at iastate.edu
Wed Sep 6 11:42:07 EDT 2017


One alternative way for the state of Utah to grow their data collection would be to host/mirror metadata records for the library's data that is also related to the state. Maybe you can make a specific collection for them to harvest once you are up and running? I would encourage this over a full-transfer; LOCKSS works! 

Iowa also has a government "open data" portal: https://data.iowa.gov/ 
It has a lot of useful data in it but it is only what I would call "operational" or "administrative" in nature, and not generated by research.

-Megan

Megan O'Donnell
Data Services Librarian
Entomology, EEOB, NREM, and Environment Librarian
Iowa State University Library
mno at iastate.edu   (515) 294-1670
Impact Story  ORCiD: 0000-0002-4632-6642
Personal pronouns: she/her






-----Original Message-----
From: Rdap [mailto:rdap-bounces at asist.org] On Behalf Of Joe Hourcle
Sent: Tuesday, September 05, 2017 8:26 PM
To: Daureen Nesdill <daureen.nesdill at utah.edu>
Cc: Research Data, Access and Preservation <rdap at asist.org>
Subject: Re: [Rdap] saving server space in institutional repositories



On Tue, 5 Sep 2017, Daureen Nesdill wrote:

> Hi Joe
>
> Thanks for your comments.
>
> Below in green

Um ... I'm using a plain text client (no risk of it opening attachments or executing javascript on me).  But I know what was mine.

> -----Original Message-----
> From: Joe Hourcle [mailto:oneiros at grace.nascom.nasa.gov]
>
>
>> On Tue, 5 Sep 2017, Daureen Nesdill wrote:
>>
>>> Hi
>>>
>>> In 2014 the Data Act was passed
>>> https://www.usaspending.gov/Pages/data-act.aspx to increase 
>>> transparency and accountability in government. As a result states 
>>> and cities have been developing portals to their open data. Utah is 
>>> one of those states : https://utah.gov/digital/ https://opendata.utah.gov/.
>>> They are looking for any data related to the state. Guess what? 
>>> There is a lot of research at the U of Utah related to the state - 
>>> health, environment, disaster relief and recovery, fire, land use, 
>>> water quality, etc.
>>>
>>> If all the data related to research about the state is hosted on 
>>> state servers at state expense, then the library does not have to 
>>> host it and save server space and save $$$$.
>>>
>>> Anyone else working with their state IT?

>> Before you shift everything to them, you should check to see who is 
>> considered responsible for the data if it was generated as part of a 
>> grant.  If it's the university, you'd probably want to keep a dark 
>> copy, just in case the state archives loses it.

> It is not the state archive but the IT department in the governor's 
> office. We actually have nothing in our repository - it is still in 
> beta.
>
> And yes we are looking into :
>
> The UU owns the data so will the UU allow researchers to give it away 
> to the state? (faculty senate?)

The way that you worded it, it sounds like you're just giving a copy to the state.  What you're proposing to do is ceding responsibility for its preservation to the state.  And I would *not* trust an IT department to do that.  State Archives would understand the implications, IT would not.

> Data generated from research performed on this campus must stay on 
> campus. So we give a second copy to the State and do not save $$$

Not necessarily true.  You could move the local copy to a lower class of storage (e.g., JBOD, to be restored from the state's copy should something go wrong).

> Do we need to get legal involved and draw up an agreement with the 
> state?

I would.  At the very least you need a Memorandum of Understanding, spelling out what each group is responsible for.  You may also want something like a Service Level Agreement, but those are usually for IT services where money's changing hands.  (The service provider specifies what sort of uptime guarantee & minimum level of service (bandwidth, etc.) will be available, or you don't have to pay for some period (month, week, etc.).)


> Contractual agreements with funders may indicate an entity other than 
> the UU owns the data.

And in those cases, you may not be able to transfer the data to another group (at all, much less give them responsibility for it), but you also may have restrictions on distribution (which, after having worked in IT for 20+ years, I wouldn't trust to a run-of-the-mill IT department).

You can run into problems where the group generating the data obtained restricted data from some other group, and may only use it for a very narrow purpose.  This seems to come up more typically with non-US groups (where data *can* be copyrighted) as part of a data sharing agreement.

There can also be legal restrictions -- HIPAA for human research (use has to go through IRB approval), ITAR for certain types of physics and engineering data, etc.

Animal & ecology data can also be sensitive -- research on endangered species could be used by poachers.

Our group has the luxury of *only* distributing unrestricted data (and we won't accept restricted data), but part of that's because dealing with authentication & authorization at NASA requires a ton of extra hoops to jump through ... and if you're dealing with foreign nationals, it's even worse.


>> And I admit that it's been a while since I talked to anyone from the 
>> National Archive, but when they had the second release of 'data.gov', 
>> I was chatting with someone from there, and I remember that the 
>> amount of digital data that they were dealing with was orders of 
>> magnitude less than what our group did.  (and that's not even all of 
>> NASA).  I wouldn't  be surprised if the same was true for state archives.
>
> 23,000 datasets, but that's not the point. The state wants to grow 
> their open data repository.


Um ... that seems like they're looking for a reason to justify their existence.  What they really should be doing is two things:

1. A registry of data available that meets their inclusion criteria
2. A repository for organizations that don't have a suitable system to
    serve the data to the public and/or preserve it for the long term (or
    at least the time frame required by law)

One of the big problems is that government IT departments can be at the whim of politicians -- suddenly replaced by contractors ... or everything has to move to some new system (from a company owned by one of their campaign donors), etc.

And let's not forget the websites and databases that are now dark because of a change in administration and the attempts to scrub anything related to 'climate change'.

I would much rather have an archival group in charge of the authoritative record, rather than an IT department.


-Joe

ps. And after that incident w/ HTTPS-Only ... insert standard disclaimer
     about this being my own opinion and not that of my place of work ...
     although I've worked in government IT for 15+ years (state & fed
     levels), was an elected official for 6 years, and worked in university
     IT for ~7 years, so I do have some experience that influenced my
     response.


-----
Joe Hourcle
Programmer/Analyst
Solar Data Analysis Center
Goddard Space Flight Center
_______________________________________________
Rdap mailing list
Rdap at mail.asis.org
http://mail.asis.org/mailman/listinfo/rdap



