Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others



python - imdbpy - How can I take the IMDb files and put them in a SQL database to query movies by genre and year?

I'm using the imdbpy package to retrieve data from the IMDb movie database. I want to take the plain text data files distributed by IMDb and load them into a SQL database so that I can query movies by genre and year, but I can't get it to work.

Here is my code:

    import os

    from flask import Flask, render_template, request
    from flask_sqlalchemy import SQLAlchemy
    from imdb import IMDb

    app = Flask(__name__)
    basedir = os.path.abspath(os.path.dirname(__file__))
    app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///" + os.path.join(basedir, "data.sqlite")
    app.config["SECRET_KEY"] = "secretkey"
    db = SQLAlchemy(app)
    # Default access system: this queries the IMDb website, not a local database.
    instance = IMDb()


    @app.route("/")
    def home():
        return render_template("home.html")


    @app.route("/movies", methods=["GET", "POST"])
    def movies():
        if request.method == "POST":
            search = request.form.get("name")
            movie = instance.search_movie(str(search))
            movie_three = []
            for result in movie:
                # Search results are partial, so fetch the full record for each hit.
                movie_three.append(instance.get_movie(result.movieID))
            return render_template("movies.html", movie=movie, movie_three=movie_three)
        # GET request: show the search form.
        return render_template("home.html")


    if __name__ == "__main__":
        app.run()

I tried to follow these instructions, but with no success:

IMDb distributes some of its data as downloadable datasets. IMDbPY can import this data into a database and make it accessible through its API. For this, you will first need to install SQLAlchemy and the libraries that are needed for the database server you want to use. Check out the SQLAlchemy dialects documentation for more detail. Then, follow these steps:

  1. Download the files from the following address and put all of them in the same directory: https://datasets.imdbws.com/
  2. Create a database. Use a collation like utf8_unicode_ci.
  3. Import the data using the s32imdbpy.py script:

         s32imdbpy.py /path/to/the/tsv.gz/files/ URI

     URI is the identifier used to access the SQL database. For example:

         s32imdbpy.py ~/Download/imdb-s3-dataset-2018-02-07/ postgres://user:password@localhost/imdb

     Please note that for some database engines (like MySQL and MariaDB) you may need to specify the charset on the URI and sometimes also the dialect, with something like:

         mysql+mysqldb://username:password@localhost/imdb?charset=utf8
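
Conceptually, the import step reads each TSV file and writes its rows into SQL tables. The following is a minimal standard-library sketch of that idea for a single file, title.basics.tsv.gz, which holds the title type, start year and comma-separated genres. It is not how s32imdbpy.py is implemented: the table names (`titles`, `title_genres`) and both function names are hypothetical, and it assumes the column names documented for the IMDb datasets, with `\N` marking a missing value.

```python
import csv
import gzip
import sqlite3


def load_title_basics(tsv_gz_path, conn):
    """Load title.basics.tsv.gz into two SQLite tables (hypothetical schema)."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS title_genres;
        DROP TABLE IF EXISTS titles;
        CREATE TABLE titles (
            tconst TEXT PRIMARY KEY,
            title_type TEXT,
            primary_title TEXT,
            start_year INTEGER
        );
        CREATE TABLE title_genres (
            tconst TEXT REFERENCES titles (tconst),
            genre TEXT
        );
        CREATE INDEX idx_titles_year ON titles (start_year);
        CREATE INDEX idx_genres_genre ON title_genres (genre);
        """
    )
    with gzip.open(tsv_gz_path, "rt", encoding="utf-8", newline="") as handle:
        # Tab-separated with a header row; quoting is disabled because
        # titles may contain literal double quotes.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # "\N" marks a missing value in the IMDb dataset files.
            year = None if row["startYear"] == r"\N" else int(row["startYear"])
            conn.execute(
                "INSERT INTO titles VALUES (?, ?, ?, ?)",
                (row["tconst"], row["titleType"], row["primaryTitle"], year),
            )
            if row["genres"] != r"\N":
                # One row per genre makes exact-match genre queries simple.
                conn.executemany(
                    "INSERT INTO title_genres VALUES (?, ?)",
                    [(row["tconst"], genre) for genre in row["genres"].split(",")],
                )
    conn.commit()


def movies_by_genre_and_year(conn, genre, year):
    """Return the primary titles of movies with the given genre and start year."""
    rows = conn.execute(
        """
        SELECT t.primary_title
        FROM titles AS t
        JOIN title_genres AS g ON g.tconst = t.tconst
        WHERE t.title_type = 'movie' AND g.genre = ? AND t.start_year = ?
        ORDER BY t.primary_title
        """,
        (genre, year),
    )
    return [title for (title,) in rows]
```

A possible usage, assuming the file sits in the current directory: open a connection with `sqlite3.connect("imdb.sqlite")`, call `load_title_basics("title.basics.tsv.gz", conn)` once, then call `movies_by_genre_and_year(conn, "Sci-Fi", 1999)`. Splitting genres into their own table avoids `LIKE` matching on the comma-separated string.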

Once the import is finished - which should take about an hour or less on a modern system - you will have a SQL database with all the information and you can use the normal IMDbPY API:

     from imdb import IMDb
     ia = IMDb('s3', 'postgres://user:password@localhost/imdb')
     results = ia.search_movie('the matrix')
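
Once full movie records are available, narrowing them down by genre and year can be done in plain Python. The helper below is a small sketch, not part of IMDbPY: `filter_by_genre_and_year` is a hypothetical name, and it assumes each record supports `.get('genres')` (a list of strings) and `.get('year')` (an integer), as IMDbPY `Movie` objects do after `get_movie`. It only filters records you have already fetched; it does not query the whole database.

```python
def filter_by_genre_and_year(movies, genre, year):
    """Keep records whose 'genres' contain `genre` and whose 'year' equals `year`."""
    wanted = genre.casefold()
    matches = []
    for movie in movies:
        # Records without genre information are treated as having no genres.
        genres = movie.get("genres") or []
        if movie.get("year") == year and wanted in (g.casefold() for g in genres):
            matches.append(movie)
    return matches
```

With the `ia` object from the snippet above, one way to use it would be to build `full = [ia.get_movie(m.movieID) for m in results]` and then call `filter_by_genre_and_year(full, 'Sci-Fi', 1999)`.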

