tag:blogger.com,1999:blog-71642015939936654792024-03-14T17:12:20.479+01:00LiPyrary - Python for booksA blog about my daily work with Python and digitized books.Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.comBlogger12125tag:blogger.com,1999:blog-7164201593993665479.post-51531597241346467402011-09-12T12:55:00.008+02:002011-09-12T13:39:59.071+02:00Python and Linux kernel 3.0: sys.platform != 'linux2'It's getting more and more challenging to compile Python. Half a year ago Python 2.x's build system broke because of <a href="http://www.blogger.com/2011/05/how-to-compile-python-on-ubuntu-1104.html">multiarch support in Ubuntu Natty</a>. Now Linux kernel 3.0 is going to reveal yet another issue in Python's configure script.<br /><br />If you compile Python under kernel 3.0, sys.platform changes to 'linux3'. The altered platform string introduces bugs in several libraries and in our software stack, too. We and a lot of other people check for Linux with <span style="font-style:italic;">sys.platform == "linux2"</span>.<pre>>>> import sys<br />>>> sys.platform<br />'linux3'</pre>It turns out the 'configure' script causes the problem. It takes the lower case kernel name (uname -s) and the first digit of the kernel release (uname -r) to fill the variable <span style="font-style:italic;">MACHDEP</span>. The issue is discussed at great length in <a href="http://bugs.python.org/issue12326">http://bugs.python.org/issue12326</a> and is addressed in the upcoming releases 2.7.3, 3.2.2 and 3.3. The 2.7 and 3.2 series report a 3.0 Linux kernel as platform 'linux2'. Starting with Python 3.3, sys.platform will be 'linux' for kernel 2.x and 3.x.<br /><br />However, 2.7.3 isn't out yet. Worse, older versions of Python are in maintenance mode and will only see security fixes. 
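<br /><br />To make the configure behaviour described above concrete, here is a minimal Python sketch. It is only a simplified model of what this post describes (lower-cased kernel name plus first digit of the release), not the actual shell code from configure; the function names machdep_from_uname and is_linux are made up for illustration.

```python
import platform


def machdep_from_uname(system, release):
    """Simplified model of the old MACHDEP logic (an assumption, not the
    real configure code): lower-cased kernel name + first character of
    the kernel release."""
    return system.lower() + release[:1]


def is_linux(platform_string):
    """Version-independent check: matches 'linux2', 'linux3' and 'linux'."""
    return platform_string.startswith("linux")


if __name__ == "__main__":
    # platform.system() / platform.release() correspond to uname -s / uname -r
    derived = machdep_from_uname(platform.system(), platform.release())
    print(derived, is_linux(derived))
```

A prefix check like is_linux() keeps working no matter which of the three platform strings a given Python build reports.<br /><br />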
I'm going to show you how you can work around the issue.<br /><br /><span style="font-weight:bold;">Change your software</span><br /><br />I recommend that you replace all code like <span style="font-style:italic;">sys.platform == "linux2"</span> with <span style="font-style:italic;">sys.platform.startswith("linux")</span>. It causes the least trouble and is forward compatible with Python 3.3 as well.<br /><br /><span style="font-weight:bold;">Alter the configure script</span><br /><br />If you compile your own version of Python on Linux, you can alter the configure script before running it.<br /><br /><pre> case $MACHDEP in<br /> cygwin*) MACHDEP="cygwin";;<br /> darwin*) MACHDEP="darwin";;<br /> atheos*) MACHDEP="atheos";;<br /> irix646) MACHDEP="irix6";;<br /> linux*) MACHDEP="linux2";; # add this line<br /> '') MACHDEP="unknown";;<br /> esac<br /></pre><br /><span style="font-weight:bold;">Run make with MACHDEP=linux2</span><br /><br />I find it easier to run make with a different MACHDEP variable. It requires no patching.<br /><pre> ./configure<br /> make MACHDEP=linux2<br /> make altinstall<br /></pre>Good luck!Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com8tag:blogger.com,1999:blog-7164201593993665479.post-58945785654957193332011-05-23T01:24:00.004+02:002011-05-23T01:50:10.424+02:00smc.freeimage 0.0.2A while ago I noticed that Christoph Gohlke has unofficial Windows builds of my smc.freeimage wrapper on his <a href="http://www.lfd.uci.edu/~gohlke/pythonlibs/">hosting site</a>. It made me aware that some people actually use my image processing software.<br /><br />A few minutes ago I synced our internal SVN repository with the <a href="http://developer.berlios.de/projects/smcfreeimage/">project site</a> on berlios.de. The latest version uses Cython 0.14 to wrap most of <a href="http://freeimage.sourceforge.net/">FreeImage 3.15.0</a> and a limited subset of <a href="http://www.littlecms.com/">LCMS 2.1</a>. 
It can read over 30 image formats including subsets like G3 and G4 compressed TIFFs, which aren't supported by PIL. smc.freeimage also supports limited ICC transformation with embedded or external ICC profiles (for now 24bpp RGB images only) and introspection of ICC profiles. The ICC transformation is optimized with cached transformations and no-copy processing.<br /><br />smc.freeimage wraps only functionality we actually need at <a href="http://www.semantics.de/">work</a>. The core features are heavily tested in production. I estimate that we have processed between 300 TB and about half a petabyte of data, mostly uncompressed TIFF images but also compressed TIFFs, PNGs and JPEGs. <br /><br />If you are interested in adding features or building a more general solution, feel free to contact me.Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com0tag:blogger.com,1999:blog-7164201593993665479.post-17233874849405555072011-05-20T13:02:00.007+02:002011-05-20T13:37:24.300+02:00How to compile Python on Ubuntu 11.04At work we compile and deploy our own Python environment on all servers in order to have full control over versions, patches and libraries. Three days ago I stumbled upon a problem in our build process on <font style="font-weight:bold;">Ubuntu Natty</font>. Several modules like <font style="font-weight:bold;">zlib</font> weren't available. It took me a while to figure out the problem. Python's setup.py simply couldn't find <span style="font-family: courier new;">libz.so</span> in its usual search paths like <span style="font-family: courier new;">/usr/lib</span>.<br /><br />Natty has introduced a new feature called multiarch. Some shared libraries are installed in architecture specific directories, e.g. <span style="font-family: courier new;">/usr/lib/x86_64-linux-gnu/libz.so</span> instead of <span style="font-family: courier new;">/usr/lib/libz.so</span>. 
The dynamic linker ld.so has been modified to look for libraries in the new locations. If you wonder how, <span style="font-family: courier new;">/etc/ld.so.conf.d/x86_64-linux-gnu.conf</span> does the trick. However, Python's setup.py uses hard coded paths and doesn't know about the new feature. Barry's posting [1] has some background information.<br /><br />The problem has been dealt with for Python 2.7, 3.1, 3.2 and newer versions, but 2.6 and earlier won't see any fixes. Current Python releases (2.7.1, 3.2.0) suffer from the issue, too. Don't be battle-weary! The solution to the issue is rather simple.<br /><pre><br /> $ make distclean<br /> $ export LDFLAGS="-L/usr/lib/$(dpkg-architecture -qDEB_HOST_MULTIARCH)"<br /> $ ./configure<br /> $ make<br /> $ make install<br /> $ unset LDFLAGS<br /></pre><br />This adds an additional library search path <span style="font-family: courier new;">(-L/usr/lib/x86_64-linux-gnu</span> on my box). Now setup.py knows about the multiarch lib directory and builds zlib and all the other missing modules just fine.<br /><br />Actually my build system is a bit more paranoid. 
It also adds <span style="font-family: courier new;">/lib/x86_64-linux-gnu</span> as a library search path and<span style="font-family: courier new;"> /usr/include/x86_64-linux-gnu</span> as a header include path for C and C++.<br /><pre><br /> $ export arch=$(dpkg-architecture -qDEB_HOST_MULTIARCH)<br /> $ export LDFLAGS="-L/usr/lib/$arch -L/lib/$arch"<br /> $ export CFLAGS="-I/usr/include/$arch"<br /> $ export CPPFLAGS="-I/usr/include/$arch"<br /> $ ./configure<br /> $ make<br /> $ make install<br /> $ unset arch LDFLAGS CFLAGS CPPFLAGS<br /></pre><br />[1] <a href="https://lists.ubuntu.com/archives/ubuntu-devel/2011-April/033049.html">https://lists.ubuntu.com/archives/ubuntu-devel/2011-April/033049.html</a>Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com10tag:blogger.com,1999:blog-7164201593993665479.post-80858428336247806682009-08-20T01:10:00.004+02:002009-08-20T01:42:54.054+02:00Microsoft gives MSDN Premium subscription to PSF membersThis almost went unnoticed. Steve Holden (Python Software Foundation) has worked out a fantastic deal with Sam Ramji and Tom Hanrahan (both leading Open Source guys at Microsoft). Microsoft has given fourteen MSDN Premium subscriptions to Python core developers and PSF members. I'm one of the lucky few. [1], [2]<br /><br />The premium subscription includes licenses and downloads of almost every Microsoft product from MS-DOS 6.22 to Windows 7, all versions of Visual Studio and much more. This is very useful for Python core developers. Every developer with a subscription can finally set up multiple virtual boxes with 32 and 64bit versions of XP, Vista and Windows 7 to test and debug issues. 64bit versions of Windows were hard and costly to come by.<br /><br />I'll keep my Ubuntu boxes for daily work and I'll still be skeptical about Microsoft's open source politics. However, I'm glad that their paradigm towards Open Source is changing in the right direction. 
Python (more precisely IronPython) is going to become more important to Microsoft. I'll put my subscription to good use.<br /><br />Thanks to Sam, Tom and Steve!<br /><br />[1] http://mail.python.org/pipermail/python-dev/2009-July/090704.html<br />[2] http://mail.python.org/pipermail/python-dev/2009-August/091020.htmlChristian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com1tag:blogger.com,1999:blog-7164201593993665479.post-32820132432868541102009-08-13T00:59:00.003+02:002009-08-13T01:27:42.785+02:00How to add a new module search pathOnce in a while Python users ask how to add some directories to sys.path permanently. Usually a solution like the PYTHONPATH env variable is suggested to the OP. Other solutions require root privileges or modify the search path for all users. <a href="http://python.org/dev/peps/pep-0370/">PEP 370</a> adds another way that is cleaner and easier to use. It doesn't require root privileges and it doesn't suffer from other issues. PYTHONPATH causes trouble for multiple Python versions. C extensions only work for one version of Python, and most Python modules won't work on both Python 2 and 3.<br /><br />My preferred way adds additional search paths just for <span style="font-weight: bold;">one</span> version of Python and just for <span style="font-weight: bold;">me</span>. It uses a .pth file as explained in the <a href="http://docs.python.org/library/site.html#module-site">site module</a> manual. 
.pth files only work in site-packages directories, either the global or the user specific directories.<br /><br /><span style="font-size:130%;">The Python way</span><br /><br /><span style="font-family:courier new;">$ python2.6</span><br /><span style="font-family:courier new;">>>> import os</span><br /><span style="font-family:courier new;">>>> import site</span><br /><span style="font-family:courier new;">>>> site.USER_SITE</span><br /><br /><span style="font-family:courier new;">'/home/heimes/.local/lib/python2.6/site-packages'</span><br /><br /><span style="font-style: italic;">Create the directory if it doesn't exist yet</span><br /><br /><span style="font-family:courier new;">>>> if not os.path.isdir(site.USER_SITE):</span><br /><span style="font-family:courier new;">...     os.makedirs(site.USER_SITE)</span><br /><span style="font-family:courier new;">...<br /><br /><span style="font-style: italic;">mypath.pth is going to contain my list of additional search paths<br /></span></span><span style="font-family:courier new;"><br />>>> mypth = os.path.join(site.USER_SITE, "mypath.pth")</span><br /><span style="font-family:courier new;">>>> path_to_add = ["/home/heimes/modules", "/home/heimes/other_modules"]</span><br /><br /><span style="font-style: italic;">Add the search paths line by line and make sure the file ends with a newline</span><br /><span style="font-family:courier new;"><br />>>> with open(mypth, "a") as f:</span><br /><span style="font-family:courier new;">...     f.write("\n".join(path_to_add))</span><br /><span style="font-family:courier new;">...     
f.write("\n")</span><br /><span style="font-family:courier new;">...</span><br /><span style="font-family:courier new;">>>></span><br /><br /><span style="font-size:130%;">The bash way</span><br /><br /><span style="font-family:courier new;">$ python2.6 -m site --user-site</span><br /><span style="font-family:courier new;">/home/heimes/.local/lib/python2.6/site-packages</span><br /><span style="font-family:courier new;">$ mkdir -p $(python2.6 -m site --user-site)</span><br /><span style="font-family:courier new;">$ echo "/home/heimes/more_modules" >> $(python2.6 -m site --user-site)/mypath.pth</span><br /><br /><span style="font-weight: bold;">Let's check if it works</span><br /><br /><span style="font-style: italic;font-family:courier new;" >check the pth file</span><br /><span style="font-family:courier new;"><br /><span style="font-family:courier new;">$ cat $(python2.6 -m site --user-site)/mypath.pth</span></span><br /><span style="font-family:courier new;">/home/heimes/modules</span><br /><span style="font-family:courier new;">/home/heimes/other_modules</span><br /><span style="font-family:courier new;">/home/heimes/more_modules</span><br /><br /><span style="font-style: italic;font-family:georgia;" >Let's see if the modules are in the new search path</span><span style="font-style: italic;font-family:courier new;" ><span style="font-family:georgia;"> ... 
they aren't because the </span>directories don't exist yet.</span><br /><br /><span style="font-family:courier new;">$ python2.6 -m site</span><br /><span style="font-family:courier new;">sys.path = [</span><br /><span style="font-family:courier new;"> '/home/heimes',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/plat-linux2',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/lib-tk',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/lib-old',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/lib-dynload',</span><br /><span style="font-family:courier new;"> '/home/heimes/.local/lib/python2.6/site-packages',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages/PIL',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages/gst-0.10',</span><br /><span style="font-family:courier new;"> '/var/lib/python-support/python2.6',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages/gtk-2.0',</span><br /><span style="font-family:courier new;"> '/var/lib/python-support/python2.6/gtk-2.0',</span><br /><span style="font-family:courier new;"> '/var/lib/python-support/python2.6/pyinotify',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages/wx-2.6-gtk2-unicode',</span><br /><span style="font-family:courier new;"> '/usr/local/lib/python2.6/dist-packages',</span><br /><span style="font-family:courier new;">]</span><br /><span style="font-family:courier new;">USER_BASE: '/home/heimes/.local' (exists)</span><br /><span style="font-family:courier new;">USER_SITE: '/home/heimes/.local/lib/python2.6/site-packages' (exists)</span><br /><span style="font-family:courier new;">ENABLE_USER_SITE: True</span><br /><br 
/><span style="font-style: italic;font-family:georgia;" >create one example directory</span><br /><span style="font-family:courier new;"><br />$ mkdir /home/heimes/modules</span><br /><span style="font-family:courier new;">$ python2.6 -m site</span><br /><span style="font-family:courier new;">sys.path = [</span><br /><span style="font-family:courier new;"> '/home/heimes',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/plat-linux2',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/lib-tk',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/lib-old',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/lib-dynload',</span><br /><span style="font-family:courier new;"> '/home/heimes/.local/lib/python2.6/site-packages',</span><br /><span style="font-family:courier new;"> </span><span style="font-weight: bold;font-family:courier new;" > '/home/heimes/modules',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages/PIL',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages/gst-0.10',</span><br /><span style="font-family:courier new;"> '/var/lib/python-support/python2.6',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages/gtk-2.0',</span><br /><span style="font-family:courier new;"> '/var/lib/python-support/python2.6/gtk-2.0',</span><br /><span style="font-family:courier new;"> '/var/lib/python-support/python2.6/pyinotify',</span><br /><span style="font-family:courier new;"> '/usr/lib/python2.6/dist-packages/wx-2.6-gtk2-unicode',</span><br /><span style="font-family:courier new;"> '/usr/local/lib/python2.6/dist-packages',</span><br /><span style="font-family:courier new;">]</span><br /><span style="font-family:courier 
new;">USER_BASE: '/home/heimes/.local' (exists)</span><br /><span style="font-family:courier new;">USER_SITE: '/home/heimes/.local/lib/python2.6/site-packages' (exists)</span><br /><span style="font-family:courier new;">ENABLE_USER_SITE: True</span><br /><br />Easy, isn't it?<br />Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com3tag:blogger.com,1999:blog-7164201593993665479.post-85438021017351529282009-08-12T12:02:00.003+02:002009-08-12T12:48:26.101+02:00libxml2 crash on 64bit UbuntuI've spent the last couple of hours debugging a really strange segfault. Our application stack had a reproducible crash in libxml2 -- but only with self compiled versions of libxml2. Ubuntu's 2.6.32 worked like a charm, my self compiled 2.6.32 didn't. The very same version works on several other Debian, Redhat and SuSE boxes, 32 and 64bit, too. WTF!?<br /><br />The crash always occurred in <span style="font-family:courier new;">xmlIO.c:__xmlParserInputBufferCreateFilename() </span>with <span style="font-family:courier new;">xmlGzfileOpen()</span> as open handler. 
After several gdb debugging sessions and several recompiles I noticed a suspicious message in the make output:<br /><br /><span style="font-family:courier new;">xmlIO.c: In function 'xmlGzfileOpen_real':</span><br /><span style="font-family:courier new;">xmlIO.c:1132: warning: implicit declaration of function 'gzopen64'</span><br /><span style="font-family:courier new;">xmlIO.c:1132: warning: nested extern declaration of 'gzopen64'</span><br /><span style="font-family:courier new;">xmlIO.c:1132: warning: assignment makes pointer from integer without a cast </span><br /><span style="font-family:courier new;">xmlIO.c: In function 'xmlGzfileOpenW':</span><br /><span style="font-family:courier new;">xmlIO.c:1200: warning: assignment makes pointer from integer without a cast</span><br /><br />The message only occurred during my own compiles but not during "<span style="font-family:courier new;">apt-get source -b libxml2</span>" . Apparently Ubuntu has patched the sources to fix the issue. The changelog contains yet another hint:<br /><br /><span style="font-family:courier new;"> * libxml.h: define _LARGEFILE64_SOURCE to properly get gzopen64 defines in</span><span style="font-family:courier new;"> zlib.h. Closes: #439843. Thanks Dann Frazier.</span><br /><br />That's the solution to my problem! CFLAGS="-D<span style="font-family:courier new;">_LARGEFILE64_SOURCE" ./configure <span style="font-family: georgia;">and both the compiler warning and the crash are gone.</span><br /><br /></span>Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com0tag:blogger.com,1999:blog-7164201593993665479.post-71029405231992990622009-07-30T15:52:00.002+02:002009-07-30T15:55:13.559+02:00multiprocessing 2.6.2.1 releasedA new version of the multiprocessing backport to Python 2.4 and 2.5 has been released. It contains all fixes from the Python 2.6 branch. 
As usual the release is available as tar.gz and Windows installer for Python 2.4 and 2.5 on <a href="http://pypi.python.org/pypi/multiprocessing/2.6.2.1">PyPI</a>.Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com0tag:blogger.com,1999:blog-7164201593993665479.post-83489373791818666152009-07-03T03:18:00.002+02:002009-07-03T03:23:54.220+02:00Python 3.0 is dead, long lives Python 3.0Now it's official. The developer team has decided against a Python 3.0.2 bugfix release [1]. Python 3.0 will not see another release and everybody should move to Python 3.1 as soon as possible. The 3.1 release is so much better than 3.0 and the <a href="http://docs.python.org/dev/py3k/whatsnew/3.1.html#porting-to-python-3-1">list of incompatibilities</a> is small. Have fun!<br /><br />[1] Barry Warsaw, <a href="http://mail.python.org/pipermail/python-list/2009-July/718561.html"><span style="font-size:100%;">Python 3.0 (pinin' for the fjords)</span></a>Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com2tag:blogger.com,1999:blog-7164201593993665479.post-21129122094497300642009-04-16T20:11:00.004+02:002009-04-20T13:30:52.514+02:00Ubuntu 9.04 (Jaunty) and PythonLast week I started testing Ubuntu 9.04. The update process ran smoothly as usual but I had to switch from fglrx to radeonhd because the proprietary ATI driver seems to lack support for the graphics card of my Lenovo T60p. 
Well done, ATI ...<br /><br /><span style="font-size:130%;">Good news</span><br /><br />Ubuntu 9.04 is shipped with <span style="font-weight: bold;">four</span> versions of Python:<br /><ul><li>2.4.6 (Zope2 still requires Python 2.4)</li><li>2.5.4</li><li>2.6.2 (default)<br /></li><li>3.0.1</li></ul>You can't ask for more Python versions!<br /><br /><span style="font-size:130%;">Bad news</span><br /><br />Ubuntu's Python team hasn't included my multiprocessing backport for Python 2.4 and 2.5 in Jaunty although <a href="http://wiki.debian.org/SandroTosi">Sandro Tosi</a> has created a Debian package. You still have to install it manually from <a href="http://pypi.python.org/">PyPI</a>.<br /><br /><span style="font-weight: bold;">Warning</span><br /><br />Did you install Python 2.6 yourself before? Make sure you remove it ASAP!<br /><br />At first I couldn't figure out why lots of Python based applications were broken. It was a major issue for me because Ubuntu uses Python in lots of places. Then it occurred to me that it may be related to my previous installation of Python 2.6. Ubuntu 8.04 didn't use Python 2.6 but 9.04 uses 2.6 as default Python for its apps. And <span style="font-family: courier new;">/usr/local</span> has precedence over <span style="font-family: courier new;">/usr<span style="font-family: georgia;">. 
Once</span></span> it was gone everything worked again.<br /><br />In order to wipe your own installation of Python from your hard disk you have to remove<br /><ul><li style="font-family: courier new;">/usr/local/bin/py*2.6</li><li style="font-family: courier new;">/usr/local/lib/libpython2.6.so*</li><li><span style="font-family: courier new;">/usr/local/lib/python2.6</span> (except for an empty <span style="font-family: courier new;">site-packages</span> directory)</li></ul>Have fun!Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com1tag:blogger.com,1999:blog-7164201593993665479.post-56524523449568915922009-04-07T22:07:00.001+02:002009-04-09T14:40:34.479+02:00work, booksnakes and a side dish of cherries<div style="text-align: justify;">For more than a year I'm employed by a company called <a href="http://www.semantics.de/"><span style="font-style: italic;">S<e>mantics Kommunikationsmanagement GmbH</e></span></a> as a Python developer. The company offers multiple services related to communcation and its management in our modern world. Most of the time I'm working on the server part of a software stack called <a href="http://www.semantics.de/produkte/visual_library/">Visual Library</a>.<br /><br />The rest of the time I'm allowed to spend on Open Source projects -- <span style="font-style: italic;">up to a quarter of my work hours</span>! In the past weeks I've spent the contingent of open source time on several Python packages I've developed for my employer so far. In the next couple of weeks I will release several projects as Open Source on <a href="http://pypi.python.org/">http://pypi.python.org</a>/.<br /><br />Before I start blogging about my work I like to give you an impression what the work is all about. 
I hope it doesn't sound too much like an advertisement for the software and for my employer.<br /><br /><span style="font-size:130%;">Abstract</span><br /><br />The Visual Library software stack is used for the digitization process of bibliographic entities. Bibliographic entities is a technical term that includes a variety of things, including but not limited to books, maps, magazines, newspapers, photographs, letters, records, charters and many more. The software aids libraries in modeling the entire process. It starts with importing catalogs and metadata, assembling work batches, assigning books to scanners, importing images, quality assurance, text recognition ... The process is much, much more than simply uploading a bunch of images. Really!<br /><br />Metadata and open interfaces are very important in the world of libraries. Therefore the VLS provides various standardized interfaces and data exchange formats like METS, MODS, SRU, OAI, Epicur, MarcXML, Dublin Core and URN (just to name a few). We also heavily rely upon XML, open standards and open file formats to guarantee that the data can be read in fifty years or more from now.<br /><br />Fifty years doesn't sound like much when one deals with 500 year old books. But can you still open the images you have created on your C64 and stored on a 5 1/4 inch floppy disk twenty years ago? What about your ATARI's datasette tapes? Even NASA has issues reading their old tapes because hardware is missing or the file format is undocumented ...<br /><br />Our software is used in multiple installations across Germany and German speaking countries. The largest installation hosts about 9 TB of raw image data for more than half a million pages of more than ten thousand bibliographic entities from the 17th century. The material is from the 16th to 21st century with a focus on old entities. We have mostly German material written in German but also Latin, Greek, Hebrew, French and other languages. 
Two important projects are about Judaica (I wasn't able to find a correct translation, it roughly translates to Jewish material). The bibliographic entities originate from public libraries, usually from a university environment.<br /><br /><span style="font-size:130%;">Visual Library Server</span><br /><br />The heart of the Visual Library software stack is a Python driven web application server. It's built on top of the <a href="http://www.cherrypy.org/">CherryPy</a> framework and driven by a <a href="http://www.firebirdsql.org/">Firebird</a> database. The server utilizes a cornucopia of open source third party packages as well as commercial and proprietary software. The most noticeable Python packages are <a href="http://codespeak.net/lxml/">lxml</a> for XML and XSL(T), <a href="http://www.reportlab.org/">reportlab</a> for PDF creation, <a href="http://www.cython.org/">Cython</a> for optimization / library bindings and <a href="http://lucene.apache.org/pylucene/">PyLucene</a> for full text search.<br /><br />The software is yet another example of the power of Python. We wouldn't have been able to build such a large and complex system without Python. I'd like to thank the community for all the hard work and feature rich extensions, too.<br /><br /><span style="font-size:130%;">Examples</span><br /><br />Are you interested in more? Have a look ...<br /></div><ul style="text-align: justify;"><li><a href="http://www.dilibri.de/">Dilibri (Rhineland-Palatinate)</a><br /></li><li><a href="http://www.judaica-frankfurt.de/">Judaica Collection Frankfurt</a></li><li><a href="http://digitale.bibliothek.uni-halle.de/">Bibliotheca Ponickaviana</a></li></ul><div style="text-align: justify;"><br /></div><div style="text-align: center;"><a href="http://www.dilibri.de/rlb/content/titleinfo/74690">Churfürstl. 
Sächsisches Schreiben an dero Abgesandten in Nürenberg, 1649</a><br /></div><div style="text-align: justify;"><br /></div><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://dilibri.de/download/webcache/300/73515"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 300px; height: 387px;" src="http://dilibri.de/download/webcache/300/73515" alt="" border="0" /></a><br /><div style="text-align: center;"><a href="http://www.dilibri.de/rlb/content/titleinfo/129018">Die Herzogthumer Iulich, Cleve, und Berg samt der Grafschafft Marck, und angrænzenten Herrschafften, about 1720</a></div><div style="text-align: justify;"><br /></div><div style="text-align: center;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://dilibri.de/download/webcache/300/129021"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 300px; height: 327px;" src="http://dilibri.de/download/webcache/300/129021" alt="" border="0" /></a>[<a href="http://dilibri.de/rlb/content/zoom/129021">zoom view of this map</a>]<br /></div><div style="text-align: justify;"><br /></div>Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com0tag:blogger.com,1999:blog-7164201593993665479.post-33562673334475656662009-04-02T03:52:00.005+02:002009-04-02T04:17:17.503+02:00autoconf'ing multiprocessing<span style="font-family:times new roman;">A few weeks ago Deepak Rokade</span><b style="font-family: times new roman;"></b><span style="font-family:times new roman;"> has reported an</span><a style="font-family: times new roman;" href="http://mail.python.org/pipermail/python-list/2009-March/705639.html"> issue</a><span style="font-family:times new roman;"> with the processing package on Solaris. The posting has caught my interest since processing is the ancestor of the multiprocessing package. 
I'm the current maintainer of the </span><a style="font-family: times new roman;" href="http://code.google.com/p/python-multiprocessing">backport</a><span style="font-family:times new roman;"> to Python 2.4 and 2.5. I started a discussion about the problem on the </span><a style="font-family: times new roman;" href="http://mail.python.org/pipermail/python-dev/2009-March/087367.html">python-dev</a> mailing list. No immediate solution was found but we decided to move from hard coded configuration to an autoconf approach.<br /><blockquote style="font-family: times new roman;"><pre wrap="">"I guess multiprocessing doesn't use autoconf tests for historical reasons. Its ancestor -- the pyprocessing package -- was using hard coded values, too." [quoting myself]</pre></blockquote>I did some experiments with autoconf but I had no luck in getting it right on BSDish platforms. I ran out of free time, too. <a href="http://jessenoller.com/">Jesse</a> was going to work on the matter at PyCon anyway so I stopped pursuing the failing tests. It's no fun debugging this kind of problem through the build bots. I'm looking forward to getting access to the <a href="http://www.snakebite.org/">snakebite</a> network. It will make our work much easier.<br /><br />Anyway, it turned out that FreeBSD and Darwin have known bugs like a broken <span style="font-family:courier new;">sem_getvalue() function.</span> Jesse and Martin von Löwis have finished their <a href="http://svn.python.org/view?view=rev&revision=71009">combined work</a>. I'll release a new version of the multiprocessing backport when the tests have passed on all relevant build bots.<br /><br />Good work, Jesse and Martin!Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com0tag:blogger.com,1999:blog-7164201593993665479.post-67852979779982079012009-03-30T11:00:00.002+02:002009-03-30T11:56:38.023+02:00I blog therefore I am (online)Yesterday I decided to start my own blog. Why? 
Well, for starters blogging is no longer a hype but a well established way to tell people about interesting stuff. I'm not the kind of person who follows hypes.<br /><br />I've been working on lots of cool stuff related to <a href="http://www.python.org/">Python </a>and books over the past year. I'm planning to use this blog as a channel to tell you about various things related to my work at <a href="http://www.semantics.de/">s<<e>e>mantics</e></a> (my employer) as well as my doings in and for the Python community.<br /><br />My blog will focus on Python and my work on library software. Hence the play on words with <span style="font-style: italic;">Py</span> in the title of my blog.<br /><span style="text-decoration: underline;"><br /></span>And now for something completely different ...<br /><a href="http://en.wikipedia.org/wiki/Cogito_ergo_sum"></a><blockquote></blockquote>Christian Heimeshttp://www.blogger.com/profile/16043034511693193747noreply@blogger.com0