|
|
-
Re: [Lucy] Snapshot: "segments" key
Peter Karman 2010-03-27, 01:55
Marvin Humphrey wrote on 3/26/10 4:04 PM: > After the change, it would look like this: > > { > "entries" : [ > "schema_3.json", > ], > "segments" : [ > "seg_3" > ], > "format" : "1" > } > sounds good. -- Peter Karman . http://peknet.com/ . [EMAIL PROTECTED]
+
Peter Karman 2010-03-27, 01:55
-
Re: [Lucy] Snapshot: "segments" key
Marvin Humphrey 2010-03-27, 20:47
On Fri, Mar 26, 2010 at 08:55:09PM -0500, Peter Karman wrote: > > After the change, it would look like this: > > > > { > > "entries" : [ > > "schema_3.json", > > ], > > "segments" : [ > > "seg_3" > > ], > > "format" : "1" > > } > > > > sounds good.
Thanks, and you're right, that would have been a fine solution. However, I discovered while trying to implement it that things were a little more complicated than I'd thought, so I revisited the simplest of the alternates I'd originally discarded -- and I believe I've arrived at an improvement.
(It's always worthwhile to spend extra time and effort exploring simple solutions when it comes to file formats, since file formats impose such a heavy backwards compatibility burden.)
The new approach is to have directories listed in a snapshot file serve as stand-ins for their contents -- so when the snapshot is in use, they protect their contents recursively, and when the snapshot becomes obsolete, their contents becomes subject to recursive deletion.
I originally thought to implement this by calling Folder_Delete_Tree() on each entry in the snapshot and ruled it out because it seemed too draconian. But Delete_Tree() turns out not to have been the right way to go about things anyway, and the final implementation is much more sane.
The advantage of this approach is simplicity: Snapshot files continue to be only a single flat list, as there's no need to add a new "segments" key. Even better, snapshot files contain vastly fewer entries after this change, making them easier to read.
{ "entries" : [ "schema_3.json", "seg_3" ], "format" : "1" }
Yet we still achieve our desired result of avoiding the need to implement Delete_Segment() for all DataWriters. Writing plugins just got a little easier. :)
To make this approach safe security-wise, I had to take a couple steps. First, I had to make sure that we don't follow symlinks when recursing so that e.g. a maliciously placed symlink to "~" can't wreak havok. Second, I had to ensure that we don't follow filepaths upwards out of the index directory, so that a maliciously crafted snapshot file containing e.g. "../../../" can't do any harm.
I actually uncovered an existing minor security vulnerability during this work: under the current release, a maliciously crafted snapshot file containing variants on e.g. "../../../../etc/passwd" could cause problems if all the stars align. An attacker would first need to supply a malicious index, which that most KinoSearch users aren't vulnerable since few if any use untrusted indexes. Then there would need to be a permissions mistake for an attacker to do something like bring down a box. However, if the attacker guesses correctly on filepaths belonging to a user ("../../mail", "../../../mail"), they could do some damage. There's no limit on the number of entries in a snapshot file, so an attacker would be free to make a very large number of guesses.
We should probably make a 0.30_091 bugfix release to get that vulnerability taken care of, as well as the FreeBSD alloca() problem that's showing up in CPAN testers (solution is to #include <stdlib.h> to get alloca() on that platform).
The mods I've made since 0.30_09 dropped are all backwards compatible -- even with the changes to FilePurger implementing the snapshot file semantics change, we won't break compat until we empty out the Delete_Segment() routines for all of the DataWriter subcomponents.
I'll probably get the alloca() problem sorted with a Charmonizer patch later today.
Marvin Humphrey
+
Marvin Humphrey 2010-03-27, 20:47
-
Re: [Lucy] Snapshot: "segments" key
Marvin Humphrey 2010-03-27, 23:54
On Sat, Mar 27, 2010 at 01:47:40PM -0700, Marvin Humphrey wrote: > I'll probably get the alloca() problem sorted with a Charmonizer patch later > today. Done. That should solve this FreeBSD 6.x problem: http://www.cpantesters.org/cpan/report/7014815Wish I had access to a FreeBSD 8.x box so I could figure out what went wrong here: http://www.cpantesters.org/cpan/report/7013414It looks like a memory error somewhere in Charmonizer, but I can't reveal it by running "CHARM_VALGRIND=1 ./Build charmony" on OS X the way I could other charmonizer memory errors that were plaguing FreeBSD 8.x. Marvin Humphrey
+
Marvin Humphrey 2010-03-27, 23:54
+
Peter Karman 2010-03-28, 02:31
-
Re: [Lucy] Snapshot: "segments" key
Marvin Humphrey 2010-03-28, 01:21
On Sat, Mar 27, 2010 at 01:47:40PM -0700, Marvin Humphrey wrote: > We should probably make a 0.30_091 bugfix release.
Screw it, let's just move forward. I'm just going to move ahead with the file format changes. They shouldn't take me more than a few days, max, and then we can release 0.30_10 and 0.31.
Marvin Humphrey
+
Marvin Humphrey 2010-03-28, 01:21
-
Re: [Lucy] Snapshot: "segments" key
Marvin Humphrey 2010-03-30, 00:52
On Sat, Mar 27, 2010 at 06:21:19PM -0700, Marvin Humphrey wrote: > On Sat, Mar 27, 2010 at 01:47:40PM -0700, Marvin Humphrey wrote: > > We should probably make a 0.30_091 bugfix release. > > Screw it, let's just move forward. I'm just going to move ahead with the file > format changes. They shouldn't take me more than a few days, max, and then we > can release 0.30_10 and 0.31.
I'm done with the non-forward-compatible file format changes and I'm going to push out 0.30_10.
Marvin Humphrey
+
Marvin Humphrey 2010-03-30, 00:52
|
|