Rakuten Data Release

Purpose

Rakuten Institute of Technology is delighted to announce the Rakuten Data Release, an opportunity for researchers in universities and public research institutions to use various Rakuten data for research purposes. We highlight our open data mission below:

Contribution

Contribute to the development of applied technology in academic fields as an innovative IT company.

Acceleration

Accelerate evolution cycles in technology by strengthening links between academia and enterprises.

Promotion

Promote unique research by holding symposiums and supporting application development using large-scale data.

Details

We have enhanced the Rakuten Data Release every year since 2010. Over the past four years, we have released all product data and review data from Rakuten Ichiba, facility data and review data from Rakuten Travel and GORA (Rakuten's golf service), and recipe data and images from Rakuten Recipe. In our 2014 release, we added annotated data for the first time.

Rakuten Ichiba

All product data (Approx. 156 million items), review data (Approx. 64 million reviews)

Rakuten Travel

Facility data (Approx. 128,000 facilities), review data (Approx. 6.2 million reviews)

GORA (Rakuten's golf service)

Facility data (1,669 facilities), review data (320,000 reviews)

Rakuten Recipe

Recipe data (Approx. 800,000 recipes), recipe images (Approx. 800,000 images), Pickup recipes (1,854 recipes), Daylicious news (362 news items)

Rakuten Viki

Video data (623 videos), user behavior (Approx. 4.9 million records)

PriceMinister

User reviews (training: 80,000 records / test: 36,395 records), products reviews interests (training: 80,000 records)

Annotated data

- Tsukuba sentiment-tagged corpus (TSUKUBA corpus): Corpus, provided by the University of Tsukuba, with sentiment polarity information for each sentence of Rakuten Travel's review data
- Product images dataset with category labels: Images of products belonging to Rakuten genres that correspond to some categories in the Caltech-256 dataset
- Images with character area: Images with the rectangle coordinates of their character areas


Application Procedure

Thanks to the continuous support of NII and ALAGIN, the Rakuten Data Release is available for download from their websites. Before the data is handed over, applicants must apply to one of the two organizations below and sign the agreement contract. Links and a password for download will be sent to the applicant after the agreement is signed. Please visit one of the following websites to apply.

NII



ALAGIN



Data samples

Rakuten Ichiba
Product data, User review data
Rakuten Travel
User evaluation data, User review data, Hotel master data
GORA
Golf course data, Course data, User review data
Rakuten Recipe
Recipe information, Ingredient information, Process information, "I made it!" report information, Recipe image, Pickup recipe, Daylicious
Rakuten Viki
Video attributes, Video casts, User attributes, Behavior training
PriceMinister
User review (training), User review (test), Products reviews interests (training)
Annotated data
Tsukuba sentiment-tagged corpus, Product images dataset with category labels, Images with character area