Developer Catchup: Synchronous Node, Serviced Polyfills, Sparks Sparked, Tangrams Mapped and SHAaaaaaa!

Node.js synchronously: Node.js is sweet if you can adapt to the asynchronous model of start a thing, say what you want to do when it’s done, and do everything else anyway. Good for web request handling but bleh for trying to emulate a shell script. Turns out that in Node.js 0.12 (coming soon? anyone? Bueller?) we get synchronous child processes, so now you can run that curl or find or whatever and just wait till it has returned with its results. The folks at StrongLoop have written about these synchronous child process methods and how they make writing command line utilities in Node easier. Check it out, Noders.
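For a taste of that blocking style, here is a minimal sketch in TypeScript. It assumes a Node.js runtime whose child_process module exposes execFileSync and spawnSync; the runSync helper is our own name, and the example shells out to the Node binary itself purely so that it runs anywhere.

```typescript
import { execFileSync, spawnSync } from "child_process";

// Hypothetical helper: run a command, block until it exits, return its stdout.
// execFileSync throws if the command cannot be started or exits non-zero.
function runSync(command: string, args: string[]): string {
  return execFileSync(command, args, { encoding: "utf8" });
}

// Ask the running Node binary for its version, shell-script style:
// nothing below this call executes until the child has finished.
const nodeVersion: string = runSync(process.execPath, ["--version"]).trim();
console.log(`child said: ${nodeVersion}`);

// spawnSync does not throw on a non-zero exit; it reports the status instead.
const failed = spawnSync(process.execPath, ["-e", "process.exit(3)"]);
console.log(`exit status: ${failed.status}`);
```

Swap process.execPath for curl or find and the shape stays the same: call, wait, use the result.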

Serviced Polyfills: Polyfills fill gaps in browser functionality and standards compliance. The older the browser, the more polyfill you need to fill the gaps, and the newer, the less. But it gets hard working out how much polyfill you are going to need. Fear not, as Samuel Giles at FT Labs has an answer, “Polyfills as a Service”. Add a simple script tag pointing at a source from the polyfill.io content delivery network to your pages and whatever browser views your page, it gets the polyfills it needs. This is because the system sniffs the browser’s user agent and works out the best set of polyfills based on that. Neat idea, potentially very handy – and you can run your own private version if you need to.
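The user agent sniffing idea can be sketched in a few lines of TypeScript. This is only an illustration of the general technique, not polyfill.io’s code: the feature table, its version numbers, the parseUserAgent helper and the polyfillsFor function are all assumptions made up for the example.

```typescript
interface BrowserInfo {
  family: string;
  major: number;
}

interface Feature {
  name: string;
  // First major version with native support, per browser family.
  // Illustrative values only; a family that is absent never gained support.
  nativeSince: { [family: string]: number };
}

const features: Feature[] = [
  { name: "Array.prototype.forEach", nativeSince: { ie: 9, chrome: 1 } },
  { name: "Function.prototype.bind", nativeSince: { ie: 9, chrome: 7 } },
  { name: "Promise", nativeSince: { chrome: 32 } },
];

// Very rough user agent parsing: only old-style IE and Chrome are recognised.
function parseUserAgent(ua: string): BrowserInfo | null {
  const ie = /MSIE (\d+)/.exec(ua);
  if (ie) return { family: "ie", major: Number(ie[1]) };
  const chrome = /Chrome\/(\d+)/.exec(ua);
  if (chrome) return { family: "chrome", major: Number(chrome[1]) };
  return null;
}

// Return the names of the polyfills this browser would be sent.
function polyfillsFor(ua: string): string[] {
  const browser = parseUserAgent(ua);
  return features
    .filter((feature) => {
      if (browser === null) return true; // unknown browser: send everything
      const since = feature.nativeSince[browser.family];
      return since === undefined || browser.major < since;
    })
    .map((feature) => feature.name);
}

const ie8 = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)";
const chrome40 =
  "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.93 Safari/537.36";
console.log(polyfillsFor(ie8));
console.log(polyfillsFor(chrome40));
```

The older browser gets the full bundle and the newer one gets nothing, which is the whole appeal of doing the selection on the server.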

Spark sparks: Apache Spark just got a 1.1 release. Spark is a data processing engine for Hadoop which can run on YARN-based Hadoop clusters or in standalone mode. Spark 1.1 improves performance (and the developers already say it is up to 100 times faster than Hadoop MapReduce) and has SQL layer enhancements. 1.1 also adds more statistical functions, can take streaming data from Amazon Kinesis, can pull data from Apache Flume, and more. If you’re into clusters and data crunching and haven’t looked at Spark, you might want to look into it.

Tangram Mapping: Do you want to render cool 2D and 3D maps? Then check out Tangram, a mapping library which is building out from a WebGL implementation to other OpenGL platforms to make oodly cool dynamic map renders. Very slick.

SHAaaaaaa!: We mentioned Google’s sunsetting of SHA-1 the other week. If you were unsure why this was important, can we send you off to Why Google is Hurrying the Web to Kill SHA-1, which explains it all and includes a brief history of collision attacks in the wild.

Tails goes 1.0, Debian goes 7.5 and Apache OO goes 4.1

Snippets
Tails 1.0: The developers of Tails, the Linux distro built for anonymity and privacy, have declared the latest version Tails 1.0. Tails wires all its networking through Tor and leaves no traces on machines where it’s been livebooted. It’s ideal in situations where you want your digital footprint minimised. Version 1.0 sees browser updates, Tor patches including a Heartbleed-related blacklist, bug fixes and a new logo for the project. The announcement also lays out plans for 1.1 (a switch to Debian 7), 2.0 (better building for a longer life) and 3.0 (sandboxing and isolation) and invites developers to contribute… it is a project which has got some great reviews.

Debian 7.5: Talking about Debian, the latest bugfix and patch rollup release, Debian 7.5, has just arrived. If you keep your Debian system up to date, you’re already good, but if you install a lot of systems from spinning or stickish media then you may want to take this opportunity to update your images. Full details of the fixes, bug and security, are in the announcement.

Apache OpenOffice 4.1: The Apache OpenOffice project has announced AOo 4.1, the latest iteration in the direct descendant of the original OpenOffice. The release notes highlight the Windows version’s IAccessible2 support for better screen reader integration and the addition of comments and annotations for text ranges. Also in the release are in-place field editing, interactive cropping, unified import/drag/drop for images, better vectors, new translations (Bulgarian, Danish, Hebrew, Hindi, Thai and Norwegian Bokmål) and other updated translations and dictionaries. Behind the scenes, AOo now uses NSS libraries rather than the older Mozilla networking code so that it is a bit more secure and a lot easier to build.

LXC’s 1.0, Thrift opened again, WhatsApp serving and more – Snippets


LXC goes 1.0: Linux Containers, LXC, is now at version 1.0, a major milestone which brings together and completes a lot of things that have been working their way through the Linux kernel, like support for unprivileged containers. There’s also long term stuff like a stable API, which will be supported for five years, bindings for Lua and Python3 (with Go and Ruby supported out-of-tree), backing storage support for directories, btrfs, zfs and more, plus cloning and snapshotting. You may wonder “Hey, doesn’t Docker do many of these things?” and yes, it does, so it’ll be interesting to watch how things all work out. More details are in the news post, and check out Stéphane Graber’s 10 part blog series on LXC 1.0, which is packed full of useful stuff.

Thrift double opened: Facebook brought Thrift (PDF) to the world in 2007 via Apache Thrift and many people found the network/data serialisation framework well handy. Thing is though that Facebook went and forked their own internal version of Thrift as they filled out the features and ramped up performance, something that took major re-engineering over time. Now the company has announced fbthrift, available on Facebook’s GitHub repo and open sourced under the same Apache 2.0 licence as Apache Thrift.

Worth reading: WhatsApp’s Serving: From 2012, here’s a presentation on how WhatsApp does scale (PDF) with a combination of FreeBSD and Erlang – A New York Times profile of security reporter Brian Krebs, who’s more like an entire security intel op in one person – Enjoy Stephen Colebourne on video presenting Java 8’s Date and Time API at JAX 2013.

Lime editor, HBase 96, Font Awesome and MOON LASERS – Snippets


  • Lime text editor: People love the Sublime Text editor. But being closed source does set some folks worrying. Some of them do something about it though, such as “quarnster” who has been creating Lime as an open source version of Sublime Text. Built with a combination of Go 1.1, Python3, Oniguruma and optional Qt5, Lime still has plenty to implement, including compatibility with Sublime’s Python API, keybindings and snippets, TextMate snippets and getting solid cross platform support. But if you are looking for a project to work on…
  • HBase 96 arrives: The Hadoop-based “big data” database, HBase, has been updated to HBase 0.96 with around 2000 issues closed and lots of contributed work. This included getting the MTTR (mean time to recovery) down to under a minute, support for snapshotting tables then moving and restoring snapshots, Cygwin-free native support on Windows, more efficient compacting, a switch to Google’s Protocol Buffers (in part for futureproofing) and much more. There’s also a bunch of incompatible changes so do check the notes. Find the release and the release notes on the Apache Software Foundation’s pages.
  • Font Awesome 4.0: A font of icons? Yes, the rather spiffy Font Awesome is back with an even more awesome version 4.0, now with 370 icons in a single collection. Designed for Bootstrap 3.0.0, styled with CSS and free for commercial use. Check out the sample page and examples. And yes, you can use it without Bootstrap too.
  • And finally: NASA just announced they have got a 622Mbps download rate from the Lunar Laser Communication Demonstration. It’s asymmetric though… 20Mbps upload, but hey, to the Moon.

Cassandra’s Europe Summit – The Keynote – Extra Scaling

At the opening of the conference day at Cassandra Summit Europe 2013, Jonathan Ellis, DataStax CTO, made a point of positioning Apache Cassandra as an enterprise scalable database and one that scales in a linear fashion to massive scales. DataStax is the leading developer of, and commercial vendor of, Apache Cassandra in the form of DataStax Enterprise.

MongoDB was very much in the company’s sights as it showed benchmarks with Cassandra running 20 times faster than MongoDB. The reason was simple though: the dataset for the benchmark was bigger than the available memory on the nodes. While MongoDB performs well with the dataset in memory, Ellis says most customers want their hot data in memory and their cold data on disk, and that’s where Cassandra has the advantage with a balanced approach to memory and disk.

Away from the benchmarking, Ellis described this year’s focus for Cassandra as having been on ease of use. That meant enhanced CQL, the Cassandra Query Language, a new CQL protocol for language drivers, more emphasis on features like tracing, lightweight transactions for the 1% of cases that need them and cursors to reduce query complexity.

Internal enhancements were equally important though. For example, 2.0 took back control of a lot of memory management in Cassandra, moving it from the JVM over to a more traditional, manually handled memory manager tuned for Cassandra’s needs. This has allowed lots of data structures to reside more efficiently in memory, improving performance.

Next week will see the release of Cassandra 2.0.2 which will add what the DataStax people call “rapid read protection”. This means that when a query goes out to a cluster, rather than waiting until a node times out to return an error, the system will look for return times that are out of the ordinary (in the 99th percentile) and return an error on them early. This should improve the ability to respond when nodes are over-paused in GC or suffering some other performance hit.
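To get a feel for what “out of the ordinary” means here, this is a minimal TypeScript sketch of a 99th percentile check over recent latencies. It is not Cassandra’s implementation: the nearest-rank percentile method, the function names and the sample data are all assumptions for illustration, and what the database does once it spots a slow response is not modelled.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of the samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("need at least one sample");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical check: has this request already taken longer than the
// 99th percentile of recently observed latencies?
function isOutOfTheOrdinary(elapsedMs: number, recentLatenciesMs: number[]): boolean {
  return elapsedMs > percentile(recentLatenciesMs, 99);
}

// Made-up history: 100 recent reads taking 1ms, 2ms, ... 100ms.
const recentLatencies: number[] = Array.from({ length: 100 }, (_, i) => i + 1);
const threshold = percentile(recentLatencies, 99);
console.log(`p99 threshold: ${threshold}ms`);
console.log(`150ms read flagged: ${isOutOfTheOrdinary(150, recentLatencies)}`);
console.log(`50ms read flagged: ${isOutOfTheOrdinary(50, recentLatencies)}`);
```

The point of a percentile rather than a fixed timeout is that the threshold follows whatever the cluster’s normal happens to be.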

Ellis also talked about Cassandra 2.1, which is pencilled in for January 2014. This will see nesting and collection indexing added to the database. The filtering inside the Cassandra software should also be improved with a new combination of pessimistic allocation and smarter estimates of required space using HyperLogLog to work out what data overlaps between sets. Ellis described his slides on this, though, as “hand wavy” as there was no code written yet and asked “Don’t send me hate mail…” if it didn’t make 2.1.
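For anyone who hasn’t met HyperLogLog, here is a compact TypeScript sketch of how it can estimate the overlap between two sets without storing either of them. This is the textbook algorithm under our own assumptions (1024 registers, the first 32 bits of a SHA-1 digest as the hash, no large-range correction), not anything from the Cassandra codebase, and the row names are made up.

```typescript
import { createHash } from "crypto";

const INDEX_BITS = 10; // 2^10 = 1024 registers
const REGISTERS = 1 << INDEX_BITS;
const VALUE_BITS = 32 - INDEX_BITS; // hash bits left over for the rank

class HyperLogLog {
  readonly registers: Uint8Array = new Uint8Array(REGISTERS);

  add(item: string): void {
    // Use the first 32 bits of a SHA-1 digest as a well-mixed hash.
    const hash = createHash("sha1").update(item).digest().readUInt32BE(0);
    const index = hash >>> VALUE_BITS;
    const rest = hash & ((1 << VALUE_BITS) - 1);
    // Rank: position of the first 1-bit in the remaining bits
    // (VALUE_BITS + 1 when they are all zero).
    const rank = Math.clz32(rest) - INDEX_BITS + 1;
    if (rank > this.registers[index]) this.registers[index] = rank;
  }

  // The sketch of a union is the register-wise maximum of two sketches.
  merge(other: HyperLogLog): HyperLogLog {
    const merged = new HyperLogLog();
    for (let i = 0; i < REGISTERS; i++) {
      merged.registers[i] = Math.max(this.registers[i], other.registers[i]);
    }
    return merged;
  }

  estimate(): number {
    const alpha = 0.7213 / (1 + 1.079 / REGISTERS);
    let sum = 0;
    let zeros = 0;
    for (let i = 0; i < REGISTERS; i++) {
      sum += Math.pow(2, -this.registers[i]);
      if (this.registers[i] === 0) zeros++;
    }
    const raw = (alpha * REGISTERS * REGISTERS) / sum;
    // Small-range correction: fall back to linear counting.
    if (raw <= 2.5 * REGISTERS && zeros > 0) {
      return REGISTERS * Math.log(REGISTERS / zeros);
    }
    return raw;
  }
}

// Hypothetical data: two sets of 10,000 row keys sharing 5,000 of them.
function sketchOf(start: number, end: number): HyperLogLog {
  const hll = new HyperLogLog();
  for (let i = start; i < end; i++) hll.add(`row-${i}`);
  return hll;
}

const setA = sketchOf(0, 10000);
const setB = sketchOf(5000, 15000);
const unionEstimate = setA.merge(setB).estimate();
// Inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|
const overlapEstimate = setA.estimate() + setB.estimate() - unionEstimate;
console.log(`estimated overlap: ${Math.round(overlapEstimate)} (true value 5000)`);
```

Each sketch is only 1024 bytes however many rows go in, at the price of an estimate that is typically a few percent off; the overlap figure inherits the error of all three estimates, so treat it as a rough guide.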

DataStax’s own certified DataStax Enterprise is set to move to a Cassandra 2.0 base by the end of the year.

Talend go Apache, Mozilla and Xiph, Oracle and Java and VirtualBox updates – Snippets


  • Talend go Apache: Talend, makers of integration, ETL and other data management products, have long been proponents of the GPL license for their products. I’ve asked them about this in the past and they’ve been robust in their reasoning about why the GPL is right for them. It appears though that that era has come to an end with an announcement that the company will be stepping towards more permissive licensing. They first plan to move to LGPL with version 5.4 of their products then to Apache in 2014. They’ve been steadily exposed to permissive licensing as they have built Talend ESB on Apache projects and when they went to release “Talend Open Studio for Big Data” they decided to go with Apache for better compatibility with the Hadoop ecosystem. That product, they say, is “arguably the most adopted product from Talend, ever” and that inspired a licensing rethink. An interesting change (and if you’ve not looked at Talend’s software, check it out… there’s some powerful integration mojo in there).
  • Mozilla’s new video hire: Xiph.org founder Monty Montgomery is off to Mozilla, amicably leaving his current employer, Red Hat, for a chance to work at Mozilla with the other Xiph developers. Current work in progress is the Daala video codec, which is setting out to be a free to implement and use, and technically superior, alternative to h.265 and Google’s VP9. Mozilla is the primary sponsor of the project and, talking to Gigaom, Montgomery says progress on Daala is solid and there could be commercial products using it by the end of 2015. It looks like Mozilla are making sure that they aren’t caught again between a rock (h.264) and a hard place (VP8) in the future.
  • Oracle and Java fix time: It’s time for Oracle to drop its metric shedload of fixes for October. Short version, there’s a Java 7 update 45 (release notes) now available with security fixes for 51 vulnerabilities, nearly all of which are remotely exploitable, with eleven scoring the full 10.0 on CVSS scores and nine scoring 9.3. Typically, most Java holes are around the sandbox, WebStart and applets, but two of the 10.0 critical holes affect servers too. Update your Java 7; if you are still on Java 6, you now have two problems.
  • VirtualBox 4.3: A new version of Oracle’s open source VirtualBox has arrived. The changes in version 4.3 are sufficient for it to be called a major update. The VT-x code and AMD-V code, the guts of the virtualisation, have been rewritten to fix bugs and improve performance. There’s a new instruction interpreter that can step in when hardware virtualisation isn’t able to handle something. New notifications, better keyboard shortcuts and support for video capture have been added to the GUI, while support for emulating USB touch devices, webcam passthrough and SCSI CD-ROM emulation have also been added. There is also a new virtual router mode which lets multiple VMs share one NAT service. And obviously, there are oodles of bug fixes.

Apache Lucene and Solr go 4.5

The text-search library Lucene and Solr, the search platform built on top of it, have both been updated to version 4.5. Version 4.4 came out in July, so what’s changed in this version bump?

Well, first of all, for Lucene, the DocValues mechanism, which allows typed storage to be associated with documents, has been updated to allow for missing values, and there’s now an in-memory DocIdSet which is more efficient for carrying around smaller lists of documents. Other changes can be found in the Lucene 4.5 release notes.

Solr 4.5, as usual, benefits from and supports these changes as it is built on Lucene, but the search platform has also had its own set of improvements. For example, when running a sharded cluster, it’s now possible to set up custom routing to the various shards, including routing based on field values. Faceted searches are now multi-threaded, the solr.xml configuration file is now storable in ZooKeeper and the CloudSolrServer has the ability to send updates directly to shard leaders. Again, more details are available in the Solr 4.5 release notes and the PDF of the updated Solr reference guide is available through the Apache mirrors. Both Lucene and Solr also have various bugfixes and performance improvements.
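If routing on a field value sounds abstract, the general idea can be sketched in TypeScript as below. This is a generic hash-and-modulo illustration, not Solr’s router: the FNV-1a hash, the shardFor function, the “region” field and the shard count are all assumptions picked to keep the example short.

```typescript
// FNV-1a, a simple 32-bit string hash (an assumption chosen for brevity,
// not taken from Solr).
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Hypothetical router: choose a shard from the value of one field, so that
// every document sharing that value lands on the same shard.
function shardFor(doc: Record<string, string>, routeField: string, shardCount: number): number {
  const value = doc[routeField];
  if (value === undefined) {
    throw new Error(`document has no "${routeField}" field`);
  }
  return fnv1a(value) % shardCount;
}

const docs: Record<string, string>[] = [
  { id: "1", region: "emea", title: "First" },
  { id: "2", region: "emea", title: "Second" },
  { id: "3", region: "apac", title: "Third" },
];

for (const doc of docs) {
  console.log(`doc ${doc.id} -> shard ${shardFor(doc, "region", 4)}`);
}
```

Keeping related documents together like this means a query that filters on the routing field only needs to visit one shard rather than all of them.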