PyLucene, mail # dev - Changes to enable easy_install of packages using JCC


Re: Changes to enable easy_install of packages using JCC
Andi Vajda 2012-02-01, 23:59

  Hi Chris,

On Wed, 1 Feb 2012, Chris Wilson wrote:

> Thank you for your quick and positive reply :)
>
> On Wed, 1 Feb 2012, Andi Vajda wrote:
>
>>>  I have been working on integrating Apache Tika (in Java) with our open
>>>  source intranet application (in Python/Django) using JCC...
>>
>> Using Maven there helped considerably with getting all the pieces on the
>> Java side.
>
> Although I used maven for an initial compile of Tika, I realised that it
> would work just as well if I downloaded pre-built jar files, which I did from
> http://repo1.maven.org/maven2/org/apache/tika/.
>
>> Your remark about not needing JCC's shared library mode is probably correct
>> right now but as soon as anyone brings in another JCC-built library into
>> the same process as yours, shared mode is going to be required since the
>> Java VM can only be initialized once per process.
>
> I understand that, but I'm prepared to live with that limitation for now, as
> this is likely to be the only Java library that I integrate into this
> Python/Django application. I tried hard to find pure Python solutions, but
> Tika is simply miles ahead of the competition.
>
>> No objections to these patches in principle but it would be easier for me
>> to integrate them if you could provide patches computed from the svn
>> repository of JCC:
>> http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/ Your patches
>> seem to be small enough so I should be able to do without but it would be
>> nicer if I didn't have to guess...
>
> I think the patch that I attached was already based on trunk. The git
> repository includes the .svn directories, points to trunk, and I generated
> the patch using "svn diff".

Sorry, I missed that you indeed had attached a patch last time.
(to be continued...)

>> Also, please write small descriptions for these new command line flags to
>> go into JCC's __main__.py file:
>> http://svn.apache.org/repos/asf/lucene/pylucene/trunk/jcc/jcc/__main__.py
>
> Done, new patch attached.

Thank you !

>> This mess of setuptools patching was meant to be *temporary* until
>> setuptools' issue 43 was fixed. As you can see, I filed this bug 3 1/2
>> years ago, http://bugs.python.org/setuptools/issue43, and my patch for
>> issue 43 still hasn't been accepted, rejected, integrated, anything'ed...
>> Dormant. For over three years.
>
> Sorry about that. I've had similar experience with bugs reported against
> ubuntu, hibernate, rails... :(
>
>>>  * Why does JCC use non-standard command line arguments like --build and
>>>  --install? Can it be modified to make it easier to invoke from a
>>>  setup.py-style environment, such as exporting a setup() function as
>>>  setuptools does?
>>
>> What standard are you referring to ?
>> The python extension module build/install/deploy story on Python keeps
>> evolving... Add Python 3.x support into the mix, and the mess is complete.
>>
>> Seriously, though, I think that the right thing to do to better integrate
>> JCC with distutils/setuptools/distribute/pip/etc... is to make it into a
>> distutils 'compiler'. This requires some work, though, and I haven't done
>> it in all these years. Anyone with the itch to hack on distutils is welcome
>> to take that on.
>
> I'm afraid I don't fully understand how distutils works, it seems to be
> sparsely documented, and I don't have a lot of time and energy to work on
> refactoring jcc. I am a bit surprised that we can't just generate a source
> distribution containing the jars, .cpp files and a setup.py which does the
> rest like any other Python extension.

Same here. I don't know distutils too well and whenever I tried to dig into
it, I quickly gave up. I don't know what it means to "just generate a source
distribution".

If they contain .class files, JAR files are not source files. My
understanding could be wrong here, but I don't think they're even compatible
between 32- and 64-bit VMs. Or is that incompatible between Java 5 and 6 ?
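
On the Java 5 versus 6 question: every .class file starts with a small header that records the class-file version it was compiled for (major version 49 corresponds to Java 5, 50 to Java 6), and the format itself has no 32- or 64-bit variant. A minimal sketch of how one might inspect a JAR from Python follows. The function name class_file_versions is made up for illustration and is not part of JCC or Tika.

```python
import struct
import zipfile

# Class-file major versions for the two Java releases discussed above.
JAVA_RELEASES = {49: "Java 5", 50: "Java 6"}


def class_file_versions(jar):
    """Return the set of (major, minor) class-file versions found in a JAR.

    `jar` is a path or a file-like object that zipfile can open.
    (Hypothetical helper, for illustration only.)
    """
    versions = set()
    with zipfile.ZipFile(jar) as archive:
        for name in archive.namelist():
            if not name.endswith(".class"):
                continue
            # Header layout: 4-byte magic, 2-byte minor, 2-byte major,
            # all big-endian.
            header = archive.read(name)[:8]
            if len(header) < 8:
                continue
            magic, minor, major = struct.unpack(">IHH", header)
            if magic == 0xCAFEBABE:
                versions.add((major, minor))
    return versions
```

Calling class_file_versions("some.jar") on a real JAR and looking the major numbers up in JAVA_RELEASES gives a quick idea of the oldest Java release able to load it.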

>
A configure script for building libjcc.dylib (libjcc.so on Linux, jcc.dll on
Windows, etc...) would take care of doing what setuptools + the issue43
patch is doing for us currently: invoking the C++ compiler and linker
against the correct Python headers and libraries to produce a vanilla shared
library. With such a configure script, there is no longer a need to patch
setuptools.
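
As a rough illustration of the "correct Python headers and libraries" part, the standard sysconfig module can report where the running interpreter keeps them. This is only a sketch of the discovery step such a script would need, not JCC's actual build logic. The function name python_build_flags and the gcc-style -I/-L flags are assumptions.

```python
import sysconfig


def python_build_flags():
    """Collect the include and library directories of the running Python.

    Hypothetical helper: the -I/-L flag style assumes a gcc-like compiler.
    """
    include_dir = sysconfig.get_paths()["include"]
    # LIBDIR may be None on some platforms (e.g. Windows).
    lib_dir = sysconfig.get_config_var("LIBDIR")
    return {
        "include_dir": include_dir,
        "lib_dir": lib_dir,
        "cflags": ["-I" + include_dir],
        "ldflags": ["-L" + lib_dir] if lib_dir else [],
    }


if __name__ == "__main__":
    flags = python_build_flags()
    print("compile flags:", " ".join(flags["cflags"]))
    print("link flags:", " ".join(flags["ldflags"]))
```

A real configure script would also have to locate the Java headers and libraries, which this sketch does not attempt.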

Please, file a bug with the explanation above. Not that I promise to fix it
(a patch would be welcome, of course) but this failure should be logged at
least.

That'd be possible but has the potential of being very noisy...
It's worth a try, for sure.

Andi..