[Checkins] SVN: z3c.filetype/branches/1.1.0/ - added interfaces for Microsoft office files

Juergen Kartnaller juergen at kartnaller.at
Mon Jan 19 07:30:04 EST 2009


Log message for revision 94826:
   - Added interfaces for Microsoft Office files.
     Note: with the current magic.mimes file it is not possible to reliably
           detect Microsoft Office files from their content. All Office files
           are detected as application/msword. For now, the only way to tell
           them apart is to use the filename (see README.txt).
  

Changed:
  U   z3c.filetype/branches/1.1.0/CHANGES.txt
  U   z3c.filetype/branches/1.1.0/src/z3c/filetype/README.txt
  U   z3c.filetype/branches/1.1.0/src/z3c/filetype/api.py
  U   z3c.filetype/branches/1.1.0/src/z3c/filetype/interfaces/filetypes.py
  U   z3c.filetype/branches/1.1.0/src/z3c/filetype/magic.txt
  A   z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/excel.xls
  A   z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/portable.pdf
  A   z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/powerpoingt.ppt
  A   z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/word.doc

-=-
Modified: z3c.filetype/branches/1.1.0/CHANGES.txt
===================================================================
--- z3c.filetype/branches/1.1.0/CHANGES.txt	2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/CHANGES.txt	2009-01-19 12:30:04 UTC (rev 94826)
@@ -5,6 +5,12 @@
 After
 =====
 
+ - added interfaces for Microsoft office files
+   Note : with the current magic.mimes file it is not possible to reliably
+          detect Microsoft Office files. All Office files are detected as
+          application/msword. The only way for now is to use the filename to
+          detect the type. (see README.txt)
+
 2007/12/21 1.1.1
 ================
 

Modified: z3c.filetype/branches/1.1.0/src/z3c/filetype/README.txt
===================================================================
--- z3c.filetype/branches/1.1.0/src/z3c/filetype/README.txt	2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/src/z3c/filetype/README.txt	2009-01-19 12:30:04 UTC (rev 94826)
@@ -19,13 +19,15 @@
   >>> for name in fileNames:
   ...     if name==".svn": continue
   ...     path = os.path.join(testData, name)
-  ...     i =  api.getInterfacesFor(file(path, 'rb'))
+  ...     i =  api.getInterfacesFor(file(path, 'rb'), filename=name)
   ...     print name
   ...     print sorted(i)
   DS_Store
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IBinaryFile>]
   IMG_0504.JPG
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IJPGFile>]
+  excel.xls
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
   faces_gray.avi
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IAVIFile>]
   ftyp.mov
@@ -42,6 +44,10 @@
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IAudioMPEGFile>]
   noface.bmp
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IBMPFile>]
+  portable.pdf
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IPDFFile>]
+  powerpoingt.ppt
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
   test.flv
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IFLVFile>]
   test.gnutar
@@ -62,7 +68,30 @@
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IHTMLFile>]
   thumbnailImage_small.jpeg
   [<InterfaceClass z3c.filetype.interfaces.filetypes.IJPGFile>]
+  word.doc
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
 
+It is not possible to reliably detect Microsoft Office files from file data.
+The only way right now is to use the filename.
+
+  >>> for name in fileNames:
+  ...     if name==".svn": continue
+  ...     i =  api.getInterfacesFor(filename=name)
+  ...     print name
+  ...     print sorted(i)
+  DS_Store
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IBinaryFile>]
+  ...
+  excel.xls
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSExcelFile>]
+  ...
+  powerpoingt.ppt
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSPowerpointFile>]
+  ...
+  word.doc
+  [<InterfaceClass z3c.filetype.interfaces.filetypes.IMSWordFile>]
+
+
 The filename is only used if no interface is found, because we should
 not trust the filename in most cases.
 

Modified: z3c.filetype/branches/1.1.0/src/z3c/filetype/api.py
===================================================================
--- z3c.filetype/branches/1.1.0/src/z3c/filetype/api.py	2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/src/z3c/filetype/api.py	2009-01-19 12:30:04 UTC (rev 94826)
@@ -13,7 +13,7 @@
 def byMimeType(t):
 
     """returns interfaces implemented by mimeType"""
-    
+
     ifaces = [iface for name, iface in vars(filetypes).items() \
               if name.startswith("I")]
     res = InterfaceSet()
@@ -30,7 +30,7 @@
     objects (file argument) with an optional filename as name or
     mimeType as mime-type
     """
-    
+
     ifaces = set()
     if file is not None:
         types = magicFile.detect(file)

Modified: z3c.filetype/branches/1.1.0/src/z3c/filetype/interfaces/filetypes.py
===================================================================
--- z3c.filetype/branches/1.1.0/src/z3c/filetype/interfaces/filetypes.py	2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/src/z3c/filetype/interfaces/filetypes.py	2009-01-19 12:30:04 UTC (rev 94826)
@@ -118,3 +118,22 @@
     """XML File"""
 IXMLFile.setTaggedValue(MTM,re.compile('text/xml'))
 IXMLFile.setTaggedValue(MT,'text/xml')
+
+class IMSOfficeFile(IBinaryFile):
+    """Microsoft Office File"""
+
+class IMSWordFile(IMSOfficeFile):
+    """Microsoft Word File"""
+IMSWordFile.setTaggedValue(MTM,re.compile('application/.*msword'))
+IMSWordFile.setTaggedValue(MT,'application/msword')
+
+class IMSExcelFile(IMSOfficeFile):
+    """Microsoft Excel File"""
+IMSExcelFile.setTaggedValue(MTM,re.compile('application/.*excel'))
+IMSExcelFile.setTaggedValue(MT,'application/msexcel')
+
+class IMSPowerpointFile(IMSOfficeFile):
+    """Microsoft Powerpoint File"""
+IMSPowerpointFile.setTaggedValue(MTM,re.compile('application/.*powerpoint'))
+IMSPowerpointFile.setTaggedValue(MT,'application/mspowerpoint')
+

Modified: z3c.filetype/branches/1.1.0/src/z3c/filetype/magic.txt
===================================================================
--- z3c.filetype/branches/1.1.0/src/z3c/filetype/magic.txt	2009-01-19 07:07:37 UTC (rev 94825)
+++ z3c.filetype/branches/1.1.0/src/z3c/filetype/magic.txt	2009-01-19 12:30:04 UTC (rev 94826)
@@ -15,6 +15,7 @@
   ...     print '%s --> %r' % (name,sorted(m.detect(file(path))))
   DS_Store --> []
   IMG_0504.JPG --> ['image/jpeg']
+  excel.xls --> ['application/msword']
   faces_gray.avi --> ['video/x-msvideo']
   ftyp.mov --> ['video/quicktime']
   ipod.mp4 --> ['video/mp4', 'video/quicktime']
@@ -23,6 +24,8 @@
   logo.gif.bz2 --> ['application/x-bzip2']
   mpeglayer3.mp3 --> ['audio/mpeg']
   noface.bmp --> ['image/bmp']
+  portable.pdf --> ['application/pdf']
+  powerpoingt.ppt --> ['application/msword']
   test.flv --> ['video/x-flv']
   test.gnutar --> ['application/x-tar']
   test.html --> ['text/html']
@@ -33,3 +36,5 @@
   test2.html --> ['text/html']
   test2.thml --> ['text/html']
   thumbnailImage_small.jpeg --> ['image/jpeg']
+  word.doc --> ['application/msword']
+

Added: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/excel.xls
===================================================================
(Binary files differ)


Property changes on: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/excel.xls
___________________________________________________________________
Added: svn:mime-type
   + application/octet-stream

Added: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/portable.pdf
===================================================================
(Binary files differ)


Property changes on: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/portable.pdf
___________________________________________________________________
Added: svn:mime-type
   + application/octet-stream

Added: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/powerpoingt.ppt
===================================================================
(Binary files differ)


Property changes on: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/powerpoingt.ppt
___________________________________________________________________
Added: svn:mime-type
   + application/octet-stream

Added: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/word.doc
===================================================================
(Binary files differ)


Property changes on: z3c.filetype/branches/1.1.0/src/z3c/filetype/testdata/word.doc
___________________________________________________________________
Added: svn:mime-type
   + application/octet-stream
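
The behaviour described in the README hunk above (content-based detection
reports every Office file as application/msword, and the filename is
consulted only when no interface was found from the data) can be sketched in
plain Python. This is a minimal illustration, not the code in api.py: the
regular expressions are copied from the MTM tagged values added to
filetypes.py, while EXTENSION_TYPES, names_for_mime_type and names_for are
hypothetical names introduced here, and interface names are returned as
strings so that the sketch does not depend on zope.interface.

```python
import os
import re

# Patterns copied from the MTM tagged values added in filetypes.py.
MIME_PATTERNS = {
    "IMSWordFile": re.compile("application/.*msword"),
    "IMSExcelFile": re.compile("application/.*excel"),
    "IMSPowerpointFile": re.compile("application/.*powerpoint"),
}

# Hypothetical extension table, for illustration only (an assumption,
# not the lookup the package actually performs).
EXTENSION_TYPES = {
    ".doc": "application/msword",
    ".xls": "application/vnd.ms-excel",
    ".ppt": "application/vnd.ms-powerpoint",
}


def names_for_mime_type(mime_type):
    """Return the interface names whose pattern matches mime_type."""
    return sorted(
        name
        for name, pattern in MIME_PATTERNS.items()
        if pattern.match(mime_type)
    )


def names_for(magic_types=(), filename=None):
    """Prefer content-based types; use the filename only as a fallback."""
    found = set()
    for mime_type in magic_types:
        found.update(names_for_mime_type(mime_type))
    # The filename is consulted only if the content gave no result.
    if not found and filename is not None:
        extension = os.path.splitext(filename)[1].lower()
        guessed = EXTENSION_TYPES.get(extension)
        if guessed is not None:
            found.update(names_for_mime_type(guessed))
    return sorted(found)
```

Under these assumptions, names_for(["application/msword"], "excel.xls")
yields the Word interface name, because the content-based result wins, while
names_for(filename="excel.xls") yields the Excel one. This mirrors the two
doctest loops in README.txt.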


