Friday, January 16, 2009

Running TPC-H Queries on MySQL

Getting the TPC-H queries to work on MySQL isn't too hard, but it isn't always clear what to do. I took the instructions from here and converted them to work with MySQL.

Once TPC-H is started, you can create a database and tpch account:

mysql -u root -p
mysql> CREATE USER 'tpch'@'%' IDENTIFIED BY 'password';
mysql> CREATE DATABASE tpch;
mysql> GRANT ALL ON tpch.* to 'tpch'@'%';
mysql> USE tpch;
mysql> \. tpch/gen/dss.ddl

Then, in the gen directories, you can modify the query generator to generate queries that are as close as possible to what you need:

cp makefile.suite makefile
#Modify makefile to use
# CC = gcc, DATABASE=SQLSERVER, MACHINE=LINUX, WORKLOAD=TPCH
#In tpcd.h, SQLSERVER section:
# change #define SET_DBASE "use %s;\n"
# change #define SET_ROWCOUNT "limit %d;\n\n"
# change #define START_TRAN "BEGIN WORK;"
# change #define END_TRAN "COMMIT WORK;"
make

Then you can start generating database data at the right scale factor

./dbgen -s 1

And also modify the constraints/indices file so that it's compatible with mysql:

# Modify dss.ri
# use a search and replace in order to remove "CONNECT TO TPCD", remove references to "TPCD." and remove the lines "COMMIT WORK;"

Then you can load all the data into the databases with

mysql -u tpch -p
mysql> use tpch;
mysql> LOAD DATA LOCAL INFILE 'customer.tbl' INTO TABLE CUSTOMER FIELDS TERMINATED BY '|';
# Same with orders, lineitem, nation, partsupp, part, region, supplier
mysql> \. dss.ri


You then need to change the case of the table names because the queries use lower-case table names I think whereas the dss.ddl uses upper-case names for the tables:

mysql> alter table NATION rename nation;
# Ditto for supplier, region, partsupp, part, orders, lineitem, customer

Finally, you can test some of the queries

cp dists.dss queries
cd queries
../qgen -c tpch -s 1 1


I think the queries still need to be modified a bit to be compatible. I think queries with a limit may need the semi-colon moved around, the precision indicator during date arithmetic in query 1 may need to be removed, and the method for naming columns in query 13 might need changing.

4 comments:

  1. While trying to load data into table using below command
    LOAD DATA LOCAL INFILE 'customer.tbl' INTO TABLE CUSTOMER FIELDS TERMINATED BY '|';

    I am getting error as The used command is not allowed with this MySQL version.
    As I am using MYSQL 5.5 so What is syntax for mysql 5.5 to load data

    ReplyDelete
    Replies
    1. http://stackoverflow.com/questions/10762239/mysql-enable-load-data-local-infile

      Delete
  2. hi, I have not been able to generate queries from the query templates with anything but the default substitutions, even when seeding the qgen different with -r. Did you have any success here? Thanks!

    ReplyDelete
    Replies
    1. I haven't done any TPC-H stuff in 10 years, so I don't know what things need to be done to run the latest TPC-H on the latest database versions.

      Delete