[Checkins] SVN: z3c.filetype/trunk/ - added interfaces for Microsoft office files

Juergen Kartnaller juergen at kartnaller.at
Mon Jan 19 07:48:19 EST 2009


Log message for revision 94830:
   - added interfaces for Microsoft office files
     Note : with the current magic.mimes file it is not possible to reliably
            detect Microsoft Office files. All Office files are detected as
            application/msword. The only way for now is to use the filename to
            detect the type. (see README.txt)
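
The fallback described in the note can be sketched roughly as follows. This is an illustration only, not the code in api.py: the three regular expressions are the ones this revision adds to filetypes.py, while EXTENSION_TYPES, match_interfaces and detect are hypothetical names made up for the sketch, and the extension table is an assumption about how a filename could be mapped to a MIME type. Only the Office patterns are covered.

```python
import os
import re

# Patterns copied from the Office interfaces added in this revision.
OFFICE_PATTERNS = {
    "IMSWordFile": re.compile("application/.*msword"),
    "IMSExcelFile": re.compile("application/.*excel"),
    "IMSPowerpointFile": re.compile("application/.*powerpoint"),
}

# Hypothetical extension table (assumption): the real package resolves
# filenames through its own lookup, not through this dictionary.
EXTENSION_TYPES = {
    ".doc": "application/msword",
    ".xls": "application/vnd.ms-excel",
    ".ppt": "application/vnd.ms-powerpoint",
}


def match_interfaces(mime_types):
    """Return the sorted names of all patterns matching any given MIME type."""
    return sorted(
        name
        for name, pattern in OFFICE_PATTERNS.items()
        if any(pattern.match(t) for t in mime_types)
    )


def detect(content_types=(), filename=None):
    """Prefer content-based types; consult the filename only as a fallback."""
    found = match_interfaces(content_types)
    if found or filename is None:
        return found
    # Nothing matched from the content, so fall back to the extension.
    ext = os.path.splitext(filename)[1].lower()
    guessed = EXTENSION_TYPES.get(ext)
    return match_interfaces([guessed]) if guessed else []
```

Under these assumptions, detect(["application/msword"], "excel.xls") returns ['IMSWordFile'], because the content-based type wins, while detect(filename="excel.xls") returns ['IMSExcelFile']. That mirrors the two README listings in this change.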
  

Changed:
  U   z3c.filetype/trunk/CHANGES.txt
  U   z3c.filetype/trunk/setup.py
  U   z3c.filetype/trunk/src/z3c/filetype/README.txt
  U   z3c.filetype/trunk/src/z3c/filetype/api.py
  U   z3c.filetype/trunk/src/z3c/filetype/interfaces/filetypes.py
  U   z3c.filetype/trunk/src/z3c/filetype/magic.txt
  A   z3c.filetype/trunk/src/z3c/filetype/testdata/excel.xls
  A   z3c.filetype/trunk/src/z3c/filetype/testdata/noface.bmp
  A   z3c.filetype/trunk/src/z3c/filetype/testdata/portable.pdf
  A   z3c.filetype/trunk/src/z3c/filetype/testdata/powerpoingt.ppt
  A   z3c.filetype/trunk/src/z3c/filetype/testdata/word.doc

-=-
Modified: z3c.filetype/trunk/CHANGES.txt
===================================================================
--- z3c.filetype/trunk/CHANGES.txt	2009-01-19 12:42:04 UTC (rev 94829)
+++ z3c.filetype/trunk/CHANGES.txt	2009-01-19 12:48:18 UTC (rev 94830)
@@ -5,6 +5,21 @@
 After
 =====
 
+2009/01/19 1.2.0
+================
+
+ - added interfaces for Microsoft office files
+   Note : with the current magic.mimes file it is not possible to reliably
+          detect Microsoft Office files. All Office files are detected as
+          application/msword. The only way for now is to use the filename to
+          detect the type. (see README.txt)
+
+2007/12/21 1.1.1
+================
+
+ - added an interface for BMP image files
+ - make sure unknown formats are not recognized as a default format
+
  - Because tests failed and the author of the changes did not fix them after
    a year, I reverted trunk to version 82160.
 

Modified: z3c.filetype/trunk/setup.py
===================================================================
--- z3c.filetype/trunk/setup.py	2009-01-19 12:42:04 UTC (rev 94829)
+++ z3c.filetype/trunk/setup.py	2009-01-19 12:48:18 UTC (rev 94830)
@@ -20,7 +20,7 @@
 
 setup(
     name="z3c.filetype",
-    version="1.1.0",
+    version="1.2.0",
     namespace_packages=["z3c"],
     packages=find_packages("src"),
     package_dir={"": "src"},

Modified: z3c.filetype/trunk/src/z3c/filetype/README.txt
===================================================================
--- z3c.filetype/trunk/src/z3c/filetype/README.txt	2009-01-19 12:42:04 UTC (rev 94829)
+++ z3c.filetype/trunk/src/z3c/filetype/README.txt	2009-01-19 12:48:18 UTC (rev 94830)
@@ -19,13 +19,15 @@
   >>> for name in fileNames:
   ...     if name==".svn": continue
   ...     path = os.path.join(testData, name)
-  ...     i =  api.getInterfacesFor(file(path, 'rb'))
+  ...     i =  api.getInterfacesFor(file(path, 'rb'), filename=name)
   ...     print name
   ...     print sorted(i)
   DS_Store
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IBinaryFile>]
   IMG_0504.JPG
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IJPGFile>]
+  excel.xls
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
   faces_gray.avi
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IAVIFile>]
   ftyp.mov
@@ -40,6 +42,12 @@
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IBZIP2File>]
   mpeglayer3.mp3
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IAudioMPEGFile>]
+  noface.bmp
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IBMPFile>]
+  portable.pdf
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IPDFFile>]
+  powerpoingt.ppt
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
   test.flv
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IFLVFile>]
   test.gnutar
@@ -60,7 +68,30 @@
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IHTMLFile>]
   thumbnailImage_small.jpeg
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IJPGFile>]
+  word.doc
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
 
+It is not possible to reliably detect Microsoft Office files from file data.
+The only way right now is to use the filename.
+
+  >>> for name in fileNames:
+  ...     if name==".svn": continue
+  ...     i =  api.getInterfacesFor(filename=name)
+  ...     print name
+  ...     print sorted(i)
+  DS_Store
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IBinaryFile>]
+  ...
+  excel.xls
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSExcelFile>]
+  ...
+  powerpoingt.ppt
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSPowerpointFile>]
+  ...
+  word.doc
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
+
+
 The filename is only used if no interface is found, because we should
 not trust the filename in most cases.
 
@@ -179,6 +210,7 @@
   ...     print name + " --> " + interfaces.IFileType(i).contentType
   DS_Store --> application/octet-stream
   IMG_0504.JPG --> image/jpeg
+  excel.xls --> application/msword
   faces_gray.avi --> video/x-msvideo
   ftyp.mov --> video/quicktime
   ipod.mp4 --> video/mp4
@@ -186,6 +218,9 @@
   logo.gif --> image/gif
   logo.gif.bz2 --> application/x-bzip2
   mpeglayer3.mp3 --> audio/mpeg
+  noface.bmp --> image/bmp
+  portable.pdf --> application/pdf
+  powerpoingt.ppt --> application/msword
   test.flv --> video/x-flv
   test.gnutar --> application/x-tar
   test.html --> text/html
@@ -196,8 +231,10 @@
   test2.html --> text/html
   test2.thml --> text/html
   thumbnailImage_small.jpeg --> image/jpeg
+  word.doc --> application/msword
 
 
+
 Size adapters
 =============
 

Modified: z3c.filetype/trunk/src/z3c/filetype/api.py
===================================================================
--- z3c.filetype/trunk/src/z3c/filetype/api.py	2009-01-19 12:42:04 UTC (rev 94829)
+++ z3c.filetype/trunk/src/z3c/filetype/api.py	2009-01-19 12:48:18 UTC (rev 94830)
@@ -13,7 +13,7 @@
 def byMimeType(t):
 
     """returns interfaces implemented by mimeType"""
-    
+
     ifaces = [iface for name, iface in vars(filetypes).items() \
               if name.startswith("I")]
     res = InterfaceSet()
@@ -30,7 +30,7 @@
     objects (file argument) with an optional filename as name or
     mimeType as mime-type
     """
-    
+
     ifaces = set()
     if file is not None:
         types = magicFile.detect(file)

Modified: z3c.filetype/trunk/src/z3c/filetype/interfaces/filetypes.py
===================================================================
--- z3c.filetype/trunk/src/z3c/filetype/interfaces/filetypes.py	2009-01-19 12:42:04 UTC (rev 94829)
+++ z3c.filetype/trunk/src/z3c/filetype/interfaces/filetypes.py	2009-01-19 12:48:18 UTC (rev 94830)
@@ -40,15 +40,19 @@
 ITextFile.setTaggedValue(MTM,re.compile('^text/.+$'))
 ITextFile.setTaggedValue(MT,'text/plain')
 
-class IImageFile(ITypedFile):
-    """image files"""
-IImageFile.setTaggedValue(MTM,re.compile('^image/.+$'))
+class IImageFile(interface.Interface):
+    """marker for image files"""
 
 class IPDFFile(IBinaryFile):
     """pdf files"""
 IPDFFile.setTaggedValue(MTM,re.compile('application/pdf'))
 IPDFFile.setTaggedValue(MT,'application/pdf')
 
+class IBMPFile(IImageFile, IBinaryFile):
+    """bmp file"""
+IBMPFile.setTaggedValue(MTM,re.compile('image/bmp'))
+IBMPFile.setTaggedValue(MT,'image/bmp')
+
 class IJPGFile(IImageFile, IBinaryFile):
     """jpeg file"""
 IJPGFile.setTaggedValue(MTM,re.compile('image/jpe?g'))
@@ -64,21 +68,20 @@
 IGIFFile.setTaggedValue(MTM,re.compile('image/gif'))
 IGIFFile.setTaggedValue(MT,'image/gif')
 
-class IVideoFile(IBinaryFile):
-    """video file"""
-IVideoFile.setTaggedValue(MTM,re.compile('^video/.+$'))
+class IVideoFile(interface.Interface):
+    """marker for video file"""
 
-class IQuickTimeFile(IVideoFile):
+class IQuickTimeFile(IVideoFile, IBinaryFile):
     """Quicktime Video File Format"""
 IQuickTimeFile.setTaggedValue(MTM,re.compile('video/quicktime'))
 IQuickTimeFile.setTaggedValue(MT,'video/quicktime')
 
-class IAVIFile(IVideoFile):
+class IAVIFile(IVideoFile, IBinaryFile):
     """Quicktime Video File Format"""
 IAVIFile.setTaggedValue(MTM,re.compile('video/x-msvideo'))
 IAVIFile.setTaggedValue(MT,'video/x-msvideo')
 
-class IMPEGFile(IVideoFile):
+class IMPEGFile(IVideoFile, IBinaryFile):
     """MPEG Video File Format"""
 IMPEGFile.setTaggedValue(MTM,re.compile('video/mpe?g'))
 IMPEGFile.setTaggedValue(MT,'video/mpeg')
@@ -88,21 +91,20 @@
 IMP4File.setTaggedValue(MTM,re.compile('video/mp4'))
 IMP4File.setTaggedValue(MT,'video/mp4')
 
-class IFLVFile(IVideoFile):
+class IFLVFile(IVideoFile, IBinaryFile):
     """Macromedia Flash FLV Video File Format"""
 IFLVFile.setTaggedValue(MTM,re.compile('video/x-flv'))
 IFLVFile.setTaggedValue(MT,'video/x-flv')
 
-class IASFFile(IVideoFile):
+class IASFFile(IVideoFile, IBinaryFile):
     """Windows Media File Format"""
 IASFFile.setTaggedValue(MTM,re.compile('video/x-ms-asf'))
 IASFFile.setTaggedValue(MT,'video/x-ms-asf')
 
-class IAudioFile(ITypedFile):
+class IAudioFile(interface.Interface):
     """audio file"""
-IAudioFile.setTaggedValue(MTM,re.compile('^audio/.+$'))
 
-class IAudioMPEGFile(IAudioFile):
+class IAudioMPEGFile(IAudioFile, IBinaryFile):
     """audio file"""
 IAudioMPEGFile.setTaggedValue(MTM,re.compile('audio/mpeg'))
 IAudioMPEGFile.setTaggedValue(MT,'audio/mpeg')
@@ -116,3 +118,22 @@
     """XML File"""
 IXMLFile.setTaggedValue(MTM,re.compile('text/xml'))
 IXMLFile.setTaggedValue(MT,'text/xml')
+
+class IMSOfficeFile(IBinaryFile):
+    """Microsoft Office File"""
+
+class IMSWordFile(IMSOfficeFile):
+    """Microsoft Word File"""
+IMSWordFile.setTaggedValue(MTM,re.compile('application/.*msword'))
+IMSWordFile.setTaggedValue(MT,'application/msword')
+
+class IMSExcelFile(IMSOfficeFile):
+    """Microsoft Excel File"""
+IMSExcelFile.setTaggedValue(MTM,re.compile('application/.*excel'))
+IMSExcelFile.setTaggedValue(MT,'application/msexcel')
+
+class IMSPowerpointFile(IMSOfficeFile):
+    """Microsoft Powerpoint File"""
+IMSPowerpointFile.setTaggedValue(MTM,re.compile('application/.*powerpoint'))
+IMSPowerpointFile.setTaggedValue(MT,'application/mspowerpoint')
+

Modified: z3c.filetype/trunk/src/z3c/filetype/magic.txt
===================================================================
--- z3c.filetype/trunk/src/z3c/filetype/magic.txt	2009-01-19 12:42:04 UTC (rev 94829)
+++ z3c.filetype/trunk/src/z3c/filetype/magic.txt	2009-01-19 12:48:18 UTC (rev 94830)
@@ -15,6 +15,7 @@
   ...     print '%s --> %r' % (name,sorted(m.detect(file(path))))
   DS_Store --> []
   IMG_0504.JPG --> ['image/jpeg']
+  excel.xls --> ['application/msword']
   faces_gray.avi --> ['video/x-msvideo']
   ftyp.mov --> ['video/quicktime']
   ipod.mp4 --> ['video/mp4', 'video/quicktime']
@@ -22,6 +23,9 @@
   logo.gif --> ['image/gif']
   logo.gif.bz2 --> ['application/x-bzip2']
   mpeglayer3.mp3 --> ['audio/mpeg']
+  noface.bmp --> ['image/bmp']
+  portable.pdf --> ['application/pdf']
+  powerpoingt.ppt --> ['application/msword']
   test.flv --> ['video/x-flv']
   test.gnutar --> ['application/x-tar']
   test.html --> ['text/html']
@@ -32,3 +36,5 @@
   test2.html --> ['text/html']
   test2.thml --> ['text/html']
   thumbnailImage_small.jpeg --> ['image/jpeg']
+  word.doc --> ['application/msword']
+

Copied: z3c.filetype/trunk/src/z3c/filetype/testdata/excel.xls (from rev 94829, z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/excel.xls)
===================================================================
(Binary files differ)

Copied: z3c.filetype/trunk/src/z3c/filetype/testdata/noface.bmp (from rev 94829, z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/noface.bmp)
===================================================================
(Binary files differ)

Copied: z3c.filetype/trunk/src/z3c/filetype/testdata/portable.pdf (from rev 94829, z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/portable.pdf)
===================================================================
(Binary files differ)

Copied: z3c.filetype/trunk/src/z3c/filetype/testdata/powerpoingt.ppt (from rev 94829, z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/powerpoingt.ppt)
===================================================================
(Binary files differ)

Copied: z3c.filetype/trunk/src/z3c/filetype/testdata/word.doc (from rev 94829, z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/word.doc)
===================================================================
(Binary files differ)


