Contributing to Shark
A few notes that everyone should review before contributing to Shark.
- Running Shark Locally
- Run Shark regression tests (see Developer Guide)
- Running Shark on EC2
- Read Cliff's master project report
- Read Spark NSDI paper
- Read Hive Developer Guide
- Read Hive Queries on Tables for Hive QL examples
The Learning Scala page provides a good list of resources. I recommend the following items from that page:
- A Scala Tutorial for Java Programmers: A 20-page introduction to Scala and some of its basic concepts, and a good place to start. More code examples here.
- Simply Scala: a web site where you can interactively try Scala. There you will find a tutorial that gives a rapid overview of the basic language features, the syntax, examples you can run and the ability to try your own code with an interactive interpreter.
- Scala by Example (PDF): Takes you through the Scala features with many examples. It assumes that you are already familiar with basic Scala syntax and have a basic understanding of functional programming. It is an excellent way to expand your knowledge and skill.
- Tour of Scala: A more descriptive, yet formal, summary of the Scala language features with many code examples. A great language reference for programmers who need to check the correct use or syntax of a specific Scala feature. Once you have mastered the basic Scala syntax, this is a good place to learn specific features.
Some other great places to learn Scala are Scala School and Effective Scala.
... are submitted to the Shark JIRA issue tracker, or the mailing list for Shark users.
- Break your work into small, single-purpose patches if possible. It’s much harder to merge in a large change with a lot of disjoint features.
- Submit the patch as a GitHub pull request. For a tutorial, see the GitHub guides on forking a repo and sending a pull request.
- Follow the style of the existing codebase. We follow the standard Scala style guide, with a few changes detailed in the Spark style guide.
- For the occasional bits of Java, we use Sun's conventions, with the following changes:
- Indent two spaces per level, not four.
- Maximum line length of 100 characters.
- Imports should be placed in the following order:
  - import java...
  - import scala...
  - import thirdparty...
  - import shark...
- These groupings should be separated by an empty line and ordered alphabetically. Subpackages within each group should also be in alphabetical order.
- Make sure that your code passes the unit tests.
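The Java conventions listed above (two-space indentation, a 100-character line limit, and grouped imports) can be illustrated with a minimal sketch. The class `TableNameSorter` and its method `sortedCopy` are hypothetical names invented for this example and are not part of Shark. Only the `java` import group is populated here; the comments mark where the other groups would go under the ordering rule.

```java
// Group 1: java imports, in alphabetical order.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Group 2: scala imports would go here, separated from group 1 by one empty line.

// Group 3: third-party imports would go here.

// Group 4: shark imports would go here.

/** Hypothetical helper, not part of Shark; it only illustrates the formatting rules. */
final class TableNameSorter {
  // Each nesting level is indented by two spaces, not four.
  static List<String> sortedCopy(List<String> tableNames) {
    List<String> result = new ArrayList<String>(tableNames);
    Collections.sort(result);
    return result;
  }

  public static void main(String[] args) {
    List<String> names = new ArrayList<String>();
    names.add("users");
    names.add("clicks");
    // Keep every line at or under 100 characters; wrap long calls onto the next line.
    System.out.println(sortedCopy(names));
  }
}
```

Empty import groups are simply omitted in real code; the placeholder comments appear here only to make the ordering visible.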
Contributions via GitHub pull requests are gladly accepted from their original author. Along with any pull requests, please state that the contribution is your original work and that you license the work to the project under the project’s open source license. Whether or not you state this explicitly, by submitting any copyrighted material via pull request, email, or other means you agree to license the material under the project’s open source license and warrant that you have the legal authority to do so.