These were initially suggested for Google Summer of Code 2011 (GSoC), but they're also relevant to our own Summer of Security program, and we may continue to update and reuse this same page after 2011 as well. When applying to us for GSoC or otherwise, please use our application template (which also includes info on how to contact us).
Although we have a lot of ideas listed here, our mentoring capacity is limited and some of the ideas would be incompatible if worked on during the same summer. Thus, in 2011 we intend to work on a subset of these ideas only.
Openwall GNU/*/Linux (or simply Owl) is our security-hardened Linux distro for servers, currently at (and beyond) version 3.0. We have a nearly perfect userland in terms of privilege reduction and privilege separation in individual programs and services. Specifically, Owl 3.0 is the very first Linux distro to have no SUID programs in the default install (yet be usable). However, further work is needed (towards Owl 4.0, which we should release in 2011 or early 2012).
We can reasonably accept and work with several GSoC students on the Owl tasks below. Although the separation between the task categories is not exact, here are three potential roles that students are invited to apply for:
Owl: new functionality
Owl: updates to existing functionality
Owl: security - claimed by: Vasiliy Kulikov under GSoC 2011
Each of these roles focuses on the corresponding one of the first three task categories identified below. The documentation improvements do not correspond to a separate student role; they may be worked on by any of the students and/or by others on our team as time permits.
Functionality available out of the box needs to be expanded in multiple ways, including:
A full LAMP stack in the base system. We need to add Apache, MySQL, and PHP, and do so in accordance with our project concepts, which will include some security-relevant changes.
PPP/PPPoE/PPTP and DHCP client support (add userland packages, introduce privilege separation where needed)
Assorted extra packages that are in line with typical uses, concepts, and goals of Owl
Set up and support a package repository (for easier updates), possibly with Zypper, yum, or apt
The system should be brought more up to date:
New GNU toolchain, and new upstream software versions in general
RHEL6 binary and package compatibility where this does not conflict with our other goals
OpenVZ kernels from their “rhel6” branch (Owl 3.0 uses “rhel5” branch)
(Better) support for: IPv6 (in network startup scripts and installer), UTF-8 (in many places), GPT (disk partitions beyond 2 TB), etc.
System security should be improved further:
The “rhel6” branch OpenVZ kernel that we'd update to will need to be security-hardened, in part by reviewing, extracting, cleaning up, porting, and documenting/commenting individual changes from grsecurity and PaX (some of which have originated from Openwall's patches for older kernels), and in part by implementing new security-related changes/features, some of those specific to container-based virtualization (purpose-specific restrictions to be applied on a per-container basis). We expect help/consulting/mentoring from the author of PaX on portions that are PaX (some of these are difficult to understand from the code alone, especially the rationale behind things being done in a certain way), whereas the rest are not too complicated for a capable person to fully figure out on their own. References: 1, 2
We should work with upstreams - OpenVZ and Red Hat - to try and get some of these enhancements accepted
If time permits and this sub-task is not claimed by another person, the same person could also port the individual changes to mainstream kernels and work with LKML - although we're also listing this as a separate task below (so if claimed by another student, it will be worked on independently)
The gcc options used to build the userland will need to be adjusted (globally, but with some per-package exceptions) to maximize the effect of ASLR and to harden the programs in other ways. (This is what some other hardened distros did while we were focusing on re-working privilege management in our userland, which they did not do. Now we need to catch up, and we'll be ahead of them overall. This should be a lot easier than the work we did so far.) References: 1, 2, 3, 4
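As an illustration of the gcc options task, one way to measure progress is to check which installed binaries are position-independent, since ASLR can only randomize the load address of the main executable when it is built that way (with gcc, typically via `-fPIE -pie`). The text above does not say which options are intended, so treat that choice as an assumption. The sketch below reads the `e_type` field of the ELF header; the function names are hypothetical.

```python
import struct

ET_EXEC = 2  # executable linked to a fixed address
ET_DYN = 3   # shared object or position-independent executable


def elf_type(header: bytes) -> int:
    """Return e_type from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_type is a 16-bit field at offset 16 in both ELF32 and ELF64
    return struct.unpack_from(fmt, header, 16)[0]


def is_position_independent(path: str) -> bool:
    """True if the file at path is an ET_DYN ELF object."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN
```

Running `is_position_independent()` over the programs in a package would give a rough list of candidates for per-package exceptions. Shared libraries are also `ET_DYN`, so the check is meaningful for executables only.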
Documentation should be improved.
We should “complete” and publish a User Guide, covering not only specifics of Owl, but its use in general.
Per-package info on Owl specifics may be added (including explanations of how and in what ways certain packages on Owl are more secure than their “equivalents” found in other distros - e.g., how our syslogd runs as non-root and verifies/logs credentials of local message senders).
More web pages may be added: a packages directory (based on RPM metadata and the per-package text files mentioned above), man pages (some of these are Owl-specific).
John the Ripper is a popular Open Source and cross-platform password cracker (password security auditing tool). Its homepage has exceeded 15 million hits.
There is a little bit of overlap between some of the JtR tasks below. A compatible subset of the tasks (with no overlap and no dependencies of any one of the tasks on another) will need to be picked and the tasks' scope adjusted for GSoC to match the student applications we receive.
We can work with multiple students on a subset of these tasks, leaving the rest of the tasks for further occasions (such as for next year's GSoC). Students are welcome to apply for the following roles, which directly correspond to the tasks below:
JtR: support more non-hashes (notes) - claimed by: Dhiru Kholia under GSoC 2011
JtR: automatic rule set generation
JtR: distributed processing
JtR: parallel processing
JtR: GPU for slow hashes - claimed by: Lukas Odzioba under GSoC 2011
JtR: GPU for fast hashes
JtR: integration of contributions
Support for more things beyond password hashes: Mac OS X FileVault and/or keychain passwords (these use PBKDF2), WEP and WPA-PSK passphrases (part of the functionality of aircrack-ng available right inside JtR), SSH and PGP secret key passphrases, ZIP archive passwords (several kinds of them exist), …
Further research on and implementation of automatic rule set generation based on previously-cracked passwords. References: 1, 2
Distributed processing, including a possible sub-task:
Greater interaction with running cracking sessions (e.g., with an ircII-like ncurses interface and/or a GUI) - such as to add/remove nodes on the fly
Parallel processing (on one node) - not just further work on OpenMP support (initially integrated in 1.7.6), but also other approaches (not specific to individual hash types and achieving greater efficiency than OpenMP can provide for “non-slow” hashes)
GPU support for “slow” hashes (does not require changes to JtR interfaces and program structure)
GPU support for “fast” hashes (requires some invasive changes to achieve good efficiency)
GUI, likely using wxWidgets in C++: as wrapper around the command-line program and/or integrated (will require/provide greater interaction)
Integration of more hashes/ciphers/features/optimizations from the jumbo patch (and other user-contributed patches) into the official JtR - requires code cleanups, portability enhancements and testing, clearing up potential licensing issues (in some cases), etc. - or reimplementation
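To make the automatic rule set generation task listed above more concrete, here is a minimal sketch of one possible starting point: find cracked passwords that are a wordlist entry plus a suffix, and emit the most common suffixes as JtR append rules (`$X` appends character X). This is only an illustration under simplifying assumptions, not the approach from the referenced work; the function names are hypothetical, and characters that need special handling in rule files are not treated.

```python
from collections import Counter


def suffix_rule(word, password):
    """Return a JtR-style append rule turning word into password, or None."""
    if password.startswith(word) and len(password) > len(word):
        return "".join("$" + ch for ch in password[len(word):])
    return None


def derive_rules(wordlist, cracked, top=3):
    """Rank append rules by how many cracked passwords they would explain."""
    words = set(wordlist)
    counts = Counter()
    for pw in cracked:
        # Try every proper prefix of the password that is a wordlist entry
        for i in range(1, len(pw)):
            if pw[:i] in words:
                counts[suffix_rule(pw[:i], pw)] += 1
    return [rule for rule, _ in counts.most_common(top)]


wordlist = ["password", "monkey", "dragon"]
cracked = ["password1", "monkey1", "dragon123", "monkey123", "dragon1"]
rules = derive_rules(wordlist, cracked, top=2)
```

A real generator would also need to consider prefixes, case changes, and substitutions, and would have to weigh each rule's hit rate against the number of candidates it adds.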
We have a number of project ideas, where the student's role would correspond to completion of an entire software, research, and/or “community” project independent from our existing larger projects (albeit closely related to our activities in general).
Here are the short “role names” for individual tasks briefly described below. Please use these short names when you apply to work on one of the tasks.
blists development
New password hashing method - claimed by: Yuri Gonzaga Gonçalves da Costa under GSoC 2011
Bitslice DES
Bitslice MD*/SHA*
Virtual distributed vector computer
Own creative and relevant idea
blists is our web-based interface to mailing list archives. It works off pre-indexed mbox files. This approach enables it to be extremely fast and lightweight: messages are located instantly (in at most a few disk seeks) and there's no need to cache pre-generated HTML page bodies. Even though we're making use of blists already (for publishing our own, hosted, and some third-party mailing lists on the web), it needs a lot more work (which we have not found time for lately). Some of the things to add are: index pages with message Subjects, thread view, a search feature. These will require changes to the index file format, and the search feature is especially non-trivial to design and implement.
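The idea of serving messages from a pre-indexed mbox file can be sketched as follows. This is not blists' actual index format, which is not described here; it is a simplified illustration in which the index is just a list of byte offsets, and all names are hypothetical. It also assumes that body lines beginning with "From " are escaped in the mbox file.

```python
import io


def build_index(mbox):
    """Scan the mbox once and record the byte offset of each message."""
    offsets = []
    mbox.seek(0)
    pos = 0
    for line in mbox:
        if line.startswith(b"From "):  # mbox message separator line
            offsets.append(pos)
        pos += len(line)
    offsets.append(pos)  # end-of-file sentinel
    return offsets


def read_message(mbox, offsets, n):
    """Fetch message n with one seek and one read, without rescanning."""
    mbox.seek(offsets[n])
    return mbox.read(offsets[n + 1] - offsets[n])


# Demo on an in-memory mbox with two messages
data = (b"From a@example.com Sat Jan  1 00:00:00 2011\n"
        b"Subject: one\n\nbody 1\n\n"
        b"From b@example.com Sun Jan  2 00:00:00 2011\n"
        b"Subject: two\n\nbody 2\n")
mbox = io.BytesIO(data)
index = build_index(mbox)
second = read_message(mbox, index, 1)
```

Index pages with Subjects, a thread view, or search would need more per-message data than an offset, which is one reason such features imply changes to the index file format.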
Linux kernel hardening - extract security hardening changes from various patches (which the mentor will point out), forward-port them to the latest mainstream kernels, make it easy to enable/disable the hardening measures (both compile- and runtime), add documentation, properly submit to and work with LKML (make proposals and own discussions to completion: either rejection or acceptance). This is a noble but thankless job to do, so be prepared! The authors of those changes did not submit them “properly” and did not “own discussions to completion” precisely because the job is so thankless.

Get better password security features into PHP proper (the PHP interpreter)
New crypt(3) flavor using concepts of scrypt (not only iterations, but also parallelism and memory), optionally making use of AES-NI
GPU and/or FPGA accelerated password hashing on servers (to better compete with similarly accelerated or distributed offline password hash cracking), optional local parameterization on specific hardware (parameter unreadable from host OS)
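For readers unfamiliar with the scrypt concepts mentioned above, the sketch below shows the three tunable costs using the existing scrypt KDF as exposed by Python's `hashlib` (available when Python is built against OpenSSL 1.1 or newer). It is a stand-in to illustrate the parameters, not the proposed crypt(3) flavor, whose design is not specified here; the helper names and parameter values are assumptions.

```python
import hashlib
import hmac


def hash_password(password: bytes, salt: bytes, n=2**12, r=8, p=1) -> bytes:
    """Derive a 32-byte hash with scrypt.

    n: CPU/memory cost (a power of two), r: block size, p: parallelism.
    Memory use is roughly 128 * n * r bytes, about 4 MiB with these defaults.
    """
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=32)


def verify(password: bytes, salt: bytes, expected: bytes, **params) -> bool:
    """Recompute the hash and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt, **params), expected)


salt = b"0123456789abcdef"  # fixed here for the demo; use a random salt in practice
stored = hash_password(b"correct horse", salt)
```

Raising `n` or `r` increases the memory an attacker needs per guess, while `p` scales the amount of work that can be done in parallel. Iteration-only schemes have no such memory knob.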
We have some achievements in generating more optimal DES S-box expressions for bitslice implementations (yes, ours have fewer gates than Matthew's for a comparable set of logic operations), which we intend to make use of in JtR soon. A possible task for a student would be further work on this: automated code generation/optimization (we've done some of this already, but there's more to do), community distributed processing project (with “agents” working on portions of the task), potential application to AES, paper on the approach, code cleanups of programs used to generate the S-box expressions and code, and public release of these programs.
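As background on why gate counts matter, here is a toy sketch of bitslicing. Each machine word (here, a Python integer) holds one bit position for many independent inputs, so a Boolean expression evaluated with bitwise operations processes all inputs at once, and every gate removed from the expression saves one operation per batch. A 3-input majority function stands in for an S-box output bit; it is not one of the DES S-box expressions, and all names are hypothetical.

```python
def to_slices(values, nbits):
    """Transpose inputs: slice i has bit j set iff bit i of values[j] is set."""
    slices = [0] * nbits
    for j, v in enumerate(values):
        for i in range(nbits):
            slices[i] |= ((v >> i) & 1) << j
    return slices


def majority_naive(a, b, c):
    """Majority of three bits per lane, using 5 gates."""
    return (a & b) | (a & c) | (b & c)


def majority_fewer_gates(a, b, c):
    """Same truth table using 4 gates."""
    return (a & b) | (c & (a ^ b))


inputs = list(range(8))  # all eight 3-bit inputs, one per lane
a, b, c = to_slices(inputs, 3)
out = majority_fewer_gates(a, b, c)  # bit j of out is the result for inputs[j]
```

With native 64-bit, 128-bit, or wider vector registers the same expression handles that many lanes per operation, which is why S-box expressions with fewer gates translate directly into speed.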
Bitslice MD5 (revised), MD4, SHA* on AVX (apparently, that's the only way to use AVX 256-bit vectors for these hash types; otherwise, we're limited to 128-bit): research, testing, integration into JtR, paper
“Virtual distributed vector computer”: no native machine code distributed to nodes (good for security), yet near native performance should be possible for suitable tasks (such as bitslice implementations of ciphers applied to key search attacks) given efficient implementation of “agents” for their target machine architectures
This task involves research, design, implementation, testing, practical use example, and a publication
This might be partially hampered by US Patent 5946496 (expiring in 2017); authors' web page. We have not reviewed these yet (found them when searching for possible existing implementations of the idea, which we did not find).
Develop intensive unit tests for all libc interfaces, and test musl and other C libraries for correctness. Ideally, a missing or buggy interface would not halt the test run, so that partial results could be obtained even on incomplete and/or highly buggy C libraries.
This will be mentored by the author of musl.
Your own creative and relevant idea - please propose it to us first, then describe it again when you apply
With few exceptions (such as for changes to existing Linux kernel code, which is under GPLv2 anyway), we require any contributed code to be made available under a cut-down BSD license. The wiki page linked from here is for JtR, but we'd like to use this approach for most other projects as well. By applying to work on one of the ideas with us, you signify your acceptance of these terms and your intent to license your code contributions accordingly.
This approach permits us to combine contributed code with differently-licensed third-party code, and it does not lock us to a specific Open Source license for our releases.
Additionally, it permits us to create and sell proprietary revisions of our programs, which we're currently doing with JtR Pro. As you can see from the feature sets of free JtR vs. JtR Pro, we're not abusing this ability to artificially cripple our free software. The free JtR remains the main one, where features get implemented first, with “Pro” being branched off some free versions for those users who prefer a pre-packaged “product”. Overall, the introduction of JtR Pro has helped development of the free JtR so far, by letting us spend more time on the project (vs. doing more client-facing work on other projects). We assume that students applying for work on JtR are comfortable with this.