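I am trying to load the IMDb data into a MySQL database using IMDbPY 5.1, but I always end up with the following problem. I need the full primary key and foreign key constraints in the schema. Can anyone give me a hint as to what might be causing this?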
building database indexes (this may take a while)
# TIME createIndexes() : 38min, 6sec (wall) 0min, 0sec (user) 0min, 0sec (system)
adding foreign keys (this may take a while)
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_d`, CONSTRAINT `title_episode_of_id_exists` FOREIGN KEY (`episode_of_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, CONSTRAINT `aka_title_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, CONSTRAINT `cast_info_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, CONSTRAINT `complete_cast_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, CONSTRAINT `movie_keyword_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, CONSTRAINT `movie_link_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, CONSTRAINT `movie_info_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, CONSTRAINT `movie_info_idx_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
ERROR caught exception creating a foreign key: Cannot add or update a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, CONSTRAINT `movie_companies_movie_id_exists` FOREIGN KEY (`movie_id`) REFERENCES `title` (`id`))
# TIME createForeignKeys() : 655min, 16sec (wall) 0min, 0sec (user) 0min, 0sec (system)
RESTORING imdbIDs values for movies... WARNING: unable to restore imdbIDs using the temporary table (falling back to dbm): missing "title_extract" table (ok if this is the first run)
WARNING: unable to restore imdbIDs (ok if this is the first run)
RESTORING imdbIDs values for people... WARNING: unable to restore imdbIDs using the temporary table (falling back to dbm): missing "name_extract" table (ok if this is the first run)
WARNING: unable to restore imdbIDs (ok if this is the first run)
RESTORING imdbIDs values for characters... WARNING: unable to restore imdbIDs using the temporary table (falling back to dbm): missing "char_name_extract" table (ok if this is the first run)
WARNING: unable to restore imdbIDs (ok if this is the first run)
RESTORING imdbIDs values for companies... WARNING: unable to restore imdbIDs using the temporary table (falling back to dbm): missing "company_name_extract" table (ok if this is the first run)
WARNING: unable to restore imdbIDs (ok if this is the first run)
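One way to narrow this down might be to check whether the child tables contain ids that have no matching row in `title`. Below is a minimal sketch, not part of IMDbPY: it parses the constraint out of an error line like the ones above and builds a query that counts orphaned rows. The helper name `orphan_query` and the regular expression are my own; deriving the child table from the constraint name (`<table>_<column>_exists`) is an assumption based only on the pattern visible in the log.

```python
import re

# Matches the constraint part of the MySQL error lines shown in the log above.
FK_PATTERN = re.compile(
    r"CONSTRAINT `(?P<constraint>\w+)` FOREIGN KEY \(`(?P<column>\w+)`\) "
    r"REFERENCES `(?P<parent>\w+)` \(`(?P<parent_col>\w+)`\)"
)


def orphan_query(error_line):
    """Build a query counting child rows that have no matching parent row.

    Returns None if the line contains no foreign key constraint, or if the
    constraint name does not follow the assumed <table>_<column>_exists form.
    """
    match = FK_PATTERN.search(error_line)
    if match is None:
        return None
    column = match.group("column")
    suffix = "_" + column + "_exists"
    constraint = match.group("constraint")
    if not constraint.endswith(suffix):
        return None
    # Assumption: the child table is the constraint name minus the suffix.
    child = constraint[: -len(suffix)]
    return (
        "SELECT COUNT(*) FROM `{child}` c "
        "LEFT JOIN `{parent}` p ON c.`{column}` = p.`{parent_col}` "
        "WHERE c.`{column}` IS NOT NULL AND p.`{parent_col}` IS NULL;"
    ).format(
        child=child,
        parent=match.group("parent"),
        column=column,
        parent_col=match.group("parent_col"),
    )


if __name__ == "__main__":
    line = (
        "ERROR caught exception creating a foreign key: Cannot add or update "
        "a child row: a foreign key constraint fails (`imdb`.`#sql-65bf_e`, "
        "CONSTRAINT `cast_info_movie_id_exists` FOREIGN KEY (`movie_id`) "
        "REFERENCES `title` (`id`))"
    )
    print(orphan_query(line))
```

A non-zero count from such a query would suggest that the loaded data itself contains references to missing titles, rather than a problem with the constraint definition.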
The commands I used are as follows:
1) Install all required packages.
sudo apt-get install -y gcc python python-dev libssl-dev libxml2-dev libxslt1-dev zlib1g-dev python-setuptools python-pip
easy_install -U SQLObject
pip install MySQL-python
2) Install IMDbPY.
cd [IMDBPY_parent_directory]
wget http://prdownloads.sourceforge.net/imdbpy/IMDbPY-5.1.tar.gz
tar -xzf IMDbPY-5.1.tar.gz
cd IMDbPY-5.1
python setup.py install
3) In MySQL, create a database "imdb" and grant all privileges on it to the user "user" with the password "password".
CREATE DATABASE imdb;
GRANT ALL PRIVILEGES ON imdb.* TO 'user'@'localhost' IDENTIFIED BY 'password';
FLUSH PRIVILEGES;
4) Download all the IMDb data.
mkdir [imdb_data_directory]
cd [imdb_data_directory]
wget -r --accept="*.gz" --no-directories --no-host-directories --level 1 ftp://ftp.fu-berlin.de/pub/misc/movies/database/
5) Load the IMDb data into MySQL.
cd [IMDBPY_parent_directory]/IMDbPY-5.1/bin
python imdbpy2sql.py -d [imdb_data_directory] -u 'mysql://user:password@localhost/imdb'
My setup is:
- Python: 2.7
- MySQL: 5.7
- Ubuntu 16.04
I also tried macOS 10.12 + MySQL 5.7 + Python 2.7 and ran into the same problem.
Regarding Davide's suggestion to use SQLAlchemy instead of SQLObject:
I tried SQLAlchemy with the following command:
python imdbpy2sql.py -d [imdb_file_directory] -o sqlalchemy -u 'mysql://user:password@localhost/imdb?charset=utf8&local_infile=1'
I got the following error.
Traceback (most recent call last):
  File "imdbpy2sql.py", line 538, in <module>
    conn = setConnection(URI, DB_TABLES)
  File "/Library/Python/2.7/site-packages/IMDbPY-5.1-py2.7-macosx-10.12-intel.egg/imdb/parser/sql/alchemyadapter.py", line 489, in setConnection
    engine = create_engine(uri, **params)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/__init__.py", line 387, in create_engine
    return strategy.create(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/sqlalchemy/engine/strategies.py", line 160, in create
    engineclass.__name__))
TypeError: Invalid argument(s) 'local_infile' sent to create_engine(), using configuration MySQLDialect_mysqldb/QueuePool/Engine. Please check that the keyword arguments are appropriate for this combination of components.
I am using SQLAlchemy 1.1.8. When I switched to SQLAlchemy 0.5, I got the same error. When I switched to SQLAlchemy 0.4, I got this error instead:
Traceback (most recent call last):
  File "imdbpy2sql.py", line 323, in <module>
    from imdb.parser.sql.alchemyadapter import getDBTables, setConnection
  File "/Library/Python/2.7/site-packages/IMDbPY-5.1-py2.7-macosx-10.12-intel.egg/imdb/parser/sql/alchemyadapter.py", line 54, in <module>
    UNICODECOL: UnicodeText,
NameError: name 'UnicodeText' is not defined
Did I specify "local_infile" incorrectly?
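For context, the first traceback shows `local_infile` reaching `create_engine()` as a keyword argument, which SQLAlchemy rejects. In SQLAlchemy, driver-specific options are normally handed to the DBAPI through the `connect_args` parameter of `create_engine()`. The sketch below only illustrates separating such an option from the URI; `split_driver_options` and `DRIVER_ONLY` are hypothetical names of my own, and I have not verified how IMDbPY's `alchemyadapter.py` builds the parameters it passes on.

```python
try:  # Python 3
    from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
except ImportError:  # Python 2.7
    from urllib import urlencode
    from urlparse import parse_qsl, urlsplit, urlunsplit

# Assumption: options in this set are meant for the database driver,
# not for create_engine() itself.
DRIVER_ONLY = {"local_infile"}


def split_driver_options(uri):
    """Return (uri_without_driver_options, driver_options_dict)."""
    parts = urlsplit(uri)
    kept = []
    driver_options = {}
    for key, value in parse_qsl(parts.query):
        if key in DRIVER_ONLY:
            # Numeric flags such as local_infile=1 are converted to int.
            driver_options[key] = int(value) if value.isdigit() else value
        else:
            kept.append((key, value))
    clean_uri = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    return clean_uri, driver_options


if __name__ == "__main__":
    uri = "mysql://user:password@localhost/imdb?charset=utf8&local_infile=1"
    clean_uri, driver_options = split_driver_options(uri)
    print(clean_uri)
    print(driver_options)
```

The two results could then be used as `create_engine(clean_uri, connect_args=driver_options)`, assuming the MySQLdb driver accepts `local_infile` as a connection argument. Whether this can be applied without modifying `imdbpy2sql.py` is something I have not checked.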