Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Get the list of metadata associated with a file using Python in Ubuntu

I'm trying to get the list of metadata associated with a file, using Python in Ubuntu.

From the shell, the "extract" command works very well, but I don't know how to use it from Python: I always get a message saying that "extract" is not defined.



1 Answer


I assume you're asking about the metadata that appears in the Windows "Properties" dialogue under the "Summary" tab. (If not, just disregard this.) Here's how I managed it.

  1. Download and install the Python win32 extensions (pywin32). This puts win32, win32com, etc. into your Python[ver]/Lib/site-packages folder and makes modules such as win32api and win32com importable. For some reason, I couldn't get build 216 for Python 2.6 to work; I updated my system to Python 2.7, used build 216 for Python 2.7, and it worked. To download and install: follow the link above, click the link reading 'pywin32', click the link for the latest build (currently 216), click the link for the .exe file that matches your system and Python installation (for me, it was pywin32-216.win32-py2.7.exe), and run the .exe file.
  2. Copy and paste the code from the "Get document summary information" page on Tim Golden's tutorial into a .py file on your own computer.
  3. Tweak the code. This is optional, but if you run Tim's script as your main module without supplying a pathname as the first command-line argument, you'll get an error. To avoid that, scroll down to the bottom of the code and omit the final block, which starts with if __name__ == '__main__':.

Save your file as something like property_reader.py, and call its property_sets(filepath) function. It returns a generator object that yields one (name, properties) pair per property set, so you can iterate through it to see all the properties and their values. You could use it like this:

# Assuming 'property_reader.py' is the name of the module/file in which you saved Tim Golden's code...
import property_reader

# property_sets() returns a generator of (name, properties) pairs
propgenerator = property_reader.property_sets('[your file path]')
for name, properties in propgenerator:
    print(name)
    for k, v in properties.items():
        # %-formatting prints the same text under Python 2 and Python 3
        print("   %s => %s" % (k, v))

The output of the above code will be something like the following:

DocSummaryInformation
   PIDDSI_CATEGORY => qux
SummaryInformation
   PIDSI_TITLE => foo
   PIDSI_COMMENTS => flam
   PIDSI_AUTHOR => baz
   PIDSI_KEYWORDS => flim
   PIDSI_SUBJECT => bar
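
If the file lives on Ubuntu, as in the question, note that "extract" is an external program rather than a Python name, which is why typing it in Python produces a "not defined" error. A minimal sketch of calling it from Python 3 through the standard subprocess module follows. The helper names run_extract and parse_pairs are made up for this illustration, and the "name - value" line layout that parse_pairs expects is an assumption about the tool's output, so check it against what "extract" prints on your system.

```python
import shutil
import subprocess


def run_extract(path, tool="extract"):
    """Run the external `extract` program on `path` and return its text output.

    `tool` is looked up on PATH; FileNotFoundError is raised if it is missing.
    """
    exe = shutil.which(tool)
    if exe is None:
        raise FileNotFoundError("%r was not found on PATH" % tool)
    # Pass the arguments as a list so the path is never interpreted by a shell.
    return subprocess.check_output([exe, path], universal_newlines=True)


def parse_pairs(text, sep=" - "):
    """Split lines of the ASSUMED form 'name - value' into (name, value) tuples.

    Lines that do not contain the separator are skipped.
    """
    pairs = []
    for line in text.splitlines():
        name, found, value = line.partition(sep)
        if found:
            pairs.append((name.strip(), value.strip()))
    return pairs


if __name__ == "__main__":
    # '[your file path]' is a placeholder, as in the example above.
    for name, value in parse_pairs(run_extract("[your file path]")):
        print("%s => %s" % (name, value))
```

Keeping the parsing separate from the subprocess call means run_extract still returns the raw output unchanged if the layout assumption turns out to be wrong.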
