Where can I find sample databases with common formatted data that I can use in multiple database engines?
Does anybody know of any sample databases I could download, preferably in CSV or some similar easy to import format so that I could get more practice in working with different types of data sets?
I know that the Canadian Department of Environment has historical weather data that you can download. However, it's not in a common format I can import into any other database. Moreover, you can only run queries based on the included program, which is actually quite limited in what kind of data it can provide.
Does anybody know of any interesting data sets that are freely available in a common format that I could use with mySql, Sql Server, and other types of database engines?
12 answers given for "Where can I find sample databases with common formatted data that I can use in multiple database engines?"
Accepted Solution
The datawrangling blog posted a nice list a while back:
http://www.datawrangling.com/some-datasets-available-on-the-web
Includes financial, government data (labor, housing, etc.), and too many more to list here.
A lot of the data in Stack Overflow is licensed under the create commons. Every 3 months they release a data dump with all the questions, answers, comments, and votes.
For Microsoft SQL Server, there is the AdventureWorks.
For MySQL there are quite a few sample database at http://dev.mysql.com/doc/index-other.html
- world (world countries and cities)
- sakila(video rental)
- employee
- menagerie
I use generatedata.com to generate custom databases schemes with entries.
To use it, you can simply register a new account, or download its sources and install it on your server.
You can export generated code in SQL, XML, JSON, or even server-side scripting language like php etc.
Swivel are both good sources for data. Any database should be able to import CSV files.
What database engine are you importing into? That will help determine what formats you can include in your search.
The Federal Energy Regulatory Commission has some sample data for download in CSV format.
The Guardian newspaper in the UK has a data-store, http://www.guardian.co.uk/data-store, full of categorized datasets. They're all ultimately stored as Google Documents, so you can export them into csv & Excel.
There's a whole bunch of free SQL Server sample databases on CodePlex: http://www.codeplex.com/Wikipage?ProjectName=SqlServerSamples#databases
One very simple way to get sample data is use full applications. I needed some sample data to practice what I was learning with MySQL at the time and just downloaded PHPBB and used their provided database. If you need to add users etc, just use the program to do it.
Think generic. You can get weather data from common sources for free, thetvdb.com has a pretty nifty set of data for TV show episodes for free, sites like last.fm have a tonne of data available for music listening habits. If you just want sample data, the easiest way to get it is not thinking in terms of "I want a database". Think "what freely available data is out there".
For FileMaker, see Sample Database: http://www.yzysoft.com/printouts/yzy_soft___Sample_Database.html
You can probably find the Northwind sample database for SQLServer
It might be overkill but you can install OracleXE, I think it comes with some sample schemas or you can find the old Scott schema online.
Also, in stephen bohlen's Summer of NHibernate screen-cast series he creates a sample database, the code comes with it in xml files and you can import it like he describes in the screencast (maybe episode 2 or 3) and just not delete it later.
For Firebird you have employee.fdb
on windows OS, it is located there C:\Program Files\Firebird\Firebird_2_1\examples\empbuild