RoleLarge scale data analytics
TechnologyJAVA, PHP – Hadoop : HDFS, MapReduce, HBase, Hive – Membase – Memcache – MySQL and more.
Transparency drives trust.
And a deep sense of loyalty.
A leading independent software platform that enables brands and agencies to plan, buy, measure and optimize their global advertising wanted to improve transparency by leveraging real-time user data across television, video, display, mobile, social and other brand advertising initiatives. This requires an ability to process information to identify trends and gain insights quickly.
While, venturing into an unknown terrain is risky.
It also brings good fortunes.
We began this project when the Hadoop ecosystem was still in infancy. Every visitor was identified as new or returning from the http queries during batch processing Hadoop jobs. Oozie was still in its early stage and there was a homegrown workflow engine for troubleshooting that ran into distributed computing issues in lock management and scheduling. The project involved managing 500 servers spread over 4 regions. In the performance troubleshooting of serialization and compression layers, the Ints, Longs and Collections in Java introduce significant size overheads unless handled at job design.
Sky is the limit.
Even when technology is not on your side.
As the stability and performance issues were solved, engagement reports were published at higher frequencies, micro trends could be measured and used in targeting. For example, a tweet can cause one of the topics to be more searched that makes some inventories hotter than others which is important information for ad targeting.
If the same information is available at a later period, leveraging it may not be as effective as the trends may be short-lived. This translated into customer success with some of the key adopters of the platform especially in the impact and reach reports for advertisers.