September 23, 2010
I believe that, over time, Jaspersoft’s distinction will be less about it being an open source software company and more about its abilities as a great business intelligence software company. I expect the distinction of our open source-ness to decline partly because the success of open source software, and the benefits it brings the community and customers, become better accepted and understood each year (and, therefore, less unique). I also believe that the most valuable aspect of the open source model will long endure, well after the sheen fades from the download, forum post or roadmap voting. That is, the principles of open source software are its most distinguishing characteristic and will eventually reach not just all technology companies, but all other industries as well.
Doing the right thing when no one is watching may be the best definition of integrity. Combine that with frankness and honesty and you have the first open source principle, Transparency. With open source software, anyone can watch. Jaspersoft software engineers and our community contributors know that every line of code they write will be made available for inspection and comment by a very large community. If they had any discomfort with transparency, they would have chosen a different vocation.
Actively giving back in a very tangible way is the heart of participation. Making the open source projects of which each community member is a part more successful and more capable should be the common goal. Giving back can mean many things, especially committing time through code contributions (for those community members with the skill and expertise) or purchasing or licensing the software if the project is in any way commercial open source. Code contributions can include not just feature advancements, but language translations, bug fixes, and quality assurance testing assistance, among others.
Open source community distinction emerges because its members participate by using either their time (i.e., skill) or their money. Either is valuable and helps to make the open source project thrive. The only sin in open source is not participating. In other words, if a community member is using open source software and deriving real benefit from its existence, then participating by providing time or money should be seen as basic and reasonable reciprocity.
Collaboration is about collective engagement for the common good and is the fastest route to open source project success. If an open source project is a neighborhood, then collaboration is the barn raising. Distinguishing this from “participation”, collaboration is about helping others in the community because doing so advances the project and its usefulness for everyone.
My favorite example of collaboration is knowledge sharing through forums, blogs and idea exchanges (in some circles, called ideagoras). On JasperForge, Jaspersoft’s open source community web site, there are more than 160,000 registered members who have collectively offered nearly 80,000 forum entries across all the listed top-level projects. The variety of questions and issues being addressed by and for community members within the forums is staggering. And, the vibrancy that emerges through this exchange of skill is core to large-scale community success.
While forum activity remains brisk, I’m equally proud of our guided use of an idea exchange within JasperForge. Each top-level project includes a roadmap where community members can comment and vote on planned features. This not only allows many voices to be heard, but provides a valuable calibration for Jaspersoft and its community, ultimately yielding the most important product features and advancements in approximately the best priority order.
There are many more examples of collaboration in action, across JasperForge and other leading open source sites, but these are some of my favorites.
I talk about these three principles of open source regularly, and I’m fond of concluding that the real benefit of collaboration accrues to those who participate transparently. That’s just my clever way of mentioning all three open source principles in one actionable sentence. What are your favorite examples of these open source principles in action? Your thoughts and comments are always welcome.
Chief Executive Officer
August 9, 2010
For this blog post, I’ll describe the technologies likely to fuel these changing usage patterns and some product categories that will, therefore, get a boost.
Analytic Databases
These are data stores that use sophisticated indexing, compression, columnar storage, and/or other technologies to deliver fast querying for large data sets. Increasingly, newer entrants in this category are less expensive than their enterprise data warehouse and OLTP counterparts. Although these databases natively require structured data formats, they provide a tremendous new capability to deal with large data volumes affordably and with greater processing power. When combined with a sophisticated analytic tool (such as a ROLAP engine or in-memory analysis techniques), an analytic database can deliver speed, volume, and sophisticated multi-dimensional insight – a powerful combination. For more on this product category, check out this prior post.
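To make the columnar idea concrete, here is a minimal Python sketch (the data and structures are made up for illustration) of why an aggregate query touches far less data in a column store than in a row store:

```python
# Minimal sketch: why columnar layout favors analytic queries.
# A row store must touch every field of every record to aggregate one column;
# a column store scans a single contiguous array. Data here is hypothetical.

row_store = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "APAC", "amount": 75.5},
    {"order_id": 3, "region": "EMEA", "amount": 210.0},
]

# Columnar layout: one array per column.
column_store = {
    "order_id": [1, 2, 3],
    "region": ["EMEA", "APAC", "EMEA"],
    "amount": [120.0, 75.5, 210.0],
}

# Row-oriented aggregation: every record (all fields) is visited.
total_rows = sum(rec["amount"] for rec in row_store)

# Column-oriented aggregation: only the one column is scanned, which is
# also what makes compression and vectorized execution so effective.
total_cols = sum(column_store["amount"])

assert total_rows == total_cols
print(total_cols)
```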
Distributed Data Processing via Hadoop
Large volumes of distributed data, typically generated through web activity and transactions, represent the fastest-growing data type. This data is commonly unstructured, semi-structured or complex, and holds great promise for delivering keen business insight if tapped properly. With the open source project Hadoop, and some upstart open source companies working to commercialize it, that previously untapped information capital is now ready to be unlocked. By enabling massive sets of complex data to be manipulated in parallel processes, Hadoop provides businesses a powerful new tool to perform “big data” analysis to find trends and act on data previously out-of-reach. Increasingly, big data will be a big deal and this is an important area to watch.
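As a flavor of how Hadoop parallelizes this kind of work, here is a minimal sketch of a Hadoop Streaming job in Python that counts page hits across web logs. The log layout (URL as the seventh whitespace-delimited field) is an illustrative assumption, not a fixed standard:

```python
#!/usr/bin/env python
# Minimal Hadoop Streaming sketch: count hits per URL across large web logs.
# Hadoop runs the mapper over distributed input splits in parallel, sorts the
# emitted (url, 1) pairs by key, and feeds them to reducers grouped by URL.
# The log layout (URL as the 7th whitespace-delimited field) is an assumption.
import sys

def mapper():
    for line in sys.stdin:
        fields = line.split()
        if len(fields) > 6:
            print("%s\t1" % fields[6])  # emit (url, 1)

def reducer():
    current_url, count = None, 0
    for line in sys.stdin:  # input arrives sorted by key
        url, n = line.rstrip("\n").split("\t")
        if url != current_url and current_url is not None:
            print("%s\t%d" % (current_url, count))
            count = 0
        current_url = url
        count += int(n)
    if current_url is not None:
        print("%s\t%d" % (current_url, count))

if __name__ == "__main__":
    mapper() if "map" in sys.argv[1:] else reducer()
```

A job like this would be launched through the Hadoop Streaming jar, passing the script as both mapper and reducer (exact jar names and paths vary by distribution).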
Complex Event Processing
On their own, large data volumes already create difficult analytic challenges. When that data is being created and updated rapidly (even imperceptibly to humans), a different approach to analysis is required. CEP tools monitor streaming data, looking for events that reveal otherwise hidden patterns. I’ve referred to this technological concept elsewhere as the converse of traditional ad hoc analysis, where the data persists and the queries are dynamic. With CEP, in a sense, the query persists and the data is dynamic. You can expect CEP-based, dynamic data analysis functionality to become more interesting and capable across a wider variety of uses each year.
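That inversion, a standing query watching dynamic data, is easy to illustrate. Below is a toy Python sketch of a sliding-window rule that fires as events stream past it; the event shape and thresholds are invented for illustration and reflect no particular CEP product’s API:

```python
# Toy CEP-style sketch: the "query" (a rule over a sliding time window)
# persists while events stream past it. Event shape and thresholds are
# illustrative assumptions only.
from collections import deque
import time

class SlidingWindowRule:
    """Fire when more than `limit` matching events arrive within `seconds`."""
    def __init__(self, predicate, limit, seconds):
        self.predicate = predicate
        self.limit = limit
        self.seconds = seconds
        self.timestamps = deque()

    def on_event(self, event, now=None):
        now = time.time() if now is None else now
        if not self.predicate(event):
            return False
        self.timestamps.append(now)
        # Drop timestamps that have slid out of the window.
        while self.timestamps and now - self.timestamps[0] > self.seconds:
            self.timestamps.popleft()
        return len(self.timestamps) > self.limit

# The rule is registered once, then evaluated continuously as data arrives.
rule = SlidingWindowRule(lambda e: e["type"] == "login_failure",
                         limit=3, seconds=60)
stream = [{"type": "login_failure", "t": t} for t in (0, 5, 10, 15, 20)]
for event in stream:
    if rule.on_event(event, now=event["t"]):
        print("alert: possible brute-force attempt at t=%s" % event["t"])
```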
In-Memory Analysis
Simple, integrated, multi-dimensional views of data should not be available only to those who have spent two weeks in a special class (think ROLAP or MOLAP). They should exist alongside your favorite bar or line chart and tabular view of data. The analysis should also be constructed for you by the server, persist in memory as long as you need it (and no longer), and then get out of your way when finished. Interacting with it should be as straightforward as navigating a hyperlinked report or pivot table -- although a variety of cross-tab types, charts, maps, gauges and widgets should be available for you to do so.
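As a simple illustration of what a server can construct for you on the fly, here is a Python sketch that builds a cross-tab in memory from flat rows; the data and field names are made up:

```python
# Minimal sketch of in-memory analysis: build a cross-tab (pivot) from
# flat rows on demand, with no pre-built cube. Data is hypothetical.
from collections import defaultdict

rows = [
    {"region": "EMEA", "quarter": "Q1", "revenue": 120},
    {"region": "EMEA", "quarter": "Q2", "revenue": 140},
    {"region": "APAC", "quarter": "Q1", "revenue": 90},
    {"region": "APAC", "quarter": "Q2", "revenue": 110},
]

def crosstab(rows, row_dim, col_dim, measure):
    """Aggregate `measure` into cells keyed by (row_dim, col_dim)."""
    cells = defaultdict(float)
    for r in rows:
        cells[(r[row_dim], r[col_dim])] += r[measure]
    return cells

cells = crosstab(rows, "region", "quarter", "revenue")
for (region, quarter), total in sorted(cells.items()):
    print(region, quarter, total)

# Pivoting the view is just swapping the dimensions passed in:
pivoted = crosstab(rows, "quarter", "region", "revenue")
```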
Predictive Analytics
Ever since IBM acquired SPSS, statistical modeling is cool again (since when is IBM cool, btw?). The truth is that the natural progression when analyzing past data is to project it forward. With the need to deal with larger data volumes at lower latency, it stands to reason that predicting future results becomes more important. This is why I believe the R revolution is here to stay (R is the open source statistical analysis tool used by many in the academic and scientific world). I predict a growing commercial need for this open source juggernaut, and by this I mean a growing demand for tools based on R with more robust features and a commercial business model – and a few software companies are delivering.
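To show the core idea in miniature, projecting past data forward, here is a toy least-squares trend fit in Python with made-up monthly figures; it stands in for the far richer modeling a tool like R provides:

```python
# Toy sketch of predictive analysis: fit a least-squares trend line to past
# periods and project it one period forward. The figures are hypothetical,
# and this stands in for the far richer modeling a tool like R provides.

history = [10.0, 12.0, 13.5, 15.0, 17.0]  # e.g., revenue by month

def fit_trend(ys):
    """Ordinary least squares for y = a + b*x over x = 0..n-1."""
    n = len(ys)
    xs = list(range(n))
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope_den = sum((x - mean_x) ** 2 for x in xs)
    b = slope_num / slope_den
    a = mean_y - b * mean_x
    return a, b

a, b = fit_trend(history)
print("projected next period: %.2f" % (a + b * len(history)))
```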
Mashboards
If you follow the Open Book on BI, you know I’m a big fan of mash-up dashboards. I expect these flexible, web-based constructs to deliver the most pervasive set of contextually relevant data, gaining broader use and enabling better decisions even without fancy predictive tools (although the output from a statistical model should be embeddable within a mashboard, maintaining its link back to the model and data source along with any relevant filters). Earlier this year, I wrote an article about making better, faster decisions through the clever use of mashboards. Making those good decisions is about understanding the past and recognizing current patterns, all while understanding the proper context. These relevant visual data elements should come together in a single, navigable view. Perfect for a mashboard.
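To sketch what “maintaining its link back to the model and data source” might look like in practice, here is an illustrative Python description of a mashboard’s panels; every name and field here is hypothetical, not any product’s actual API:

```python
# Illustrative sketch of a mashboard definition: each rendered panel keeps
# its link back to the model, the data source, and the active filters, so a
# viewer can drill through from chart to context. All names are hypothetical.

forecast_panel = {
    "title": "Q4 Revenue Forecast",
    "type": "line_chart",
    "model": {"kind": "trend_projection", "source": "sales_mart.orders"},
    "filters": {"region": "EMEA", "period": "last_12_months"},
}

kpi_panel = {
    "title": "Open Support Tickets",
    "type": "gauge",
    "model": None,  # plain operational metric, no predictive model behind it
    "filters": {"status": "open"},
}

# A mashboard is just the composition of independent, filter-aware panels.
dashboard = {"name": "Exec Overview", "panels": [forecast_panel, kpi_panel]}
for panel in dashboard["panels"]:
    print(panel["title"], "->", panel["filters"])
```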
So, this is my short list of business intelligence product categories and technologies that stand to gain substantially in the next few years. Surely I’ve not covered them all, so your comments and feedback are encouraged.
Chief Executive Officer