<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DBASoul.com</title>
	<atom:link href="http://www.dbasoul.com/feed" rel="self" type="application/rss+xml" />
	<link>http://www.dbasoul.com</link>
	<description>Oracle and Unix for everybody</description>
	<lastBuildDate>Tue, 10 Apr 2012 14:14:02 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Easy way to install Oracle Database on Oracle Linux</title>
		<link>http://www.dbasoul.com/2012/1039.html</link>
		<comments>http://www.dbasoul.com/2012/1039.html#comments</comments>
		<pubDate>Tue, 10 Apr 2012 14:12:51 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[install]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1039</guid>
		<description><![CDATA[Overview Oracle is pleased to announce the general availability of the Oracle RDBMS Server 11gR2 Pre-Install RPM for Oracle Linux 6 x86_64 (64 Bit) architecture. This package replaces the previous Oracle Validated RPM. You can complete most preinstallation configuration tasks by using the Oracle RDBMS Server 11gR2 Pre-install RPM, available from the Unbreakable Linux Network,...  <a href="http://www.dbasoul.com/2012/1039.html" class="more-link" title="Read Easy way to install Oracle Database on Oracle Linux">Read more &#187;</a>
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2009/524.html' rel='bookmark' title='Install Oracle on Linux'>Install Oracle on Linux</a></li>
<li><a href='http://www.dbasoul.com/2010/707.html' rel='bookmark' title='Oracle De-install Utility'>Oracle De-install Utility</a></li>
<li><a href='http://www.dbasoul.com/2010/742.html' rel='bookmark' title='Oracle Database 11.2.0.2 Installation on Enterprise Linux x86 5.2 Part 1'>Oracle Database 11.2.0.2 Installation on Enterprise Linux x86 5.2 Part 1</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p class="lead">Overview</p>
<p>   Oracle is pleased to announce the general availability of the Oracle RDBMS Server 11gR2<br />
   Pre-Install RPM for Oracle Linux 6 x86_64 (64 Bit) architecture.</p>
<p>   This package replaces the previous Oracle Validated RPM.</p>
<p>   You can complete most preinstallation configuration tasks by using the Oracle RDBMS Server 11gR2<br />
   Pre-install RPM, available from the Unbreakable Linux Network, or via the Oracle Public Yum<br />
   repository.</p>
<p>   When it is installed, the Oracle RDBMS Pre-install RPM does the following:</p>
<p>    * Automatically installs any additional packages needed for installing Oracle Grid<br />
      Infrastructure and Oracle Database 11gR2 (11.2.0.3).</p>
<p>    * Creates an oracle user, and creates the oraInventory (oinstall) and OSDBA (dba) groups for<br />
      that user. For security purposes, this user has no password by default and cannot log in<br />
      remotely. To enable remote login, please set a password using the &#8220;passwd&#8221; tool.</p>
<p>    * Sets and verifies sysctl.conf settings, system startup parameters, user limits, and driver<br />
      parameters to the minimum acceptable values based on recommendations from the Oracle Database<br />
      Installation Guide and the Oracle Validated Configurations program.</p>
<p> Important Note</p>
<p>    Removal of the Oracle RDBMS Server 11gR2 Pre-install RPM will restore the following files to<br />
    the version prior to initial installation:</p>
<p>    * /etc/security/limits.conf<br />
    * /boot/grub/grub.conf<br />
    * /etc/sysctl.conf</p>
<p>    If you make further changes to these files after installing the Oracle RDBMS Server 11gR2<br />
    Pre-install RPM, please make a backup of your changed files and manually restore them after the<br />
    package removal is complete.</p>
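<p>The backup step recommended above could be scripted. What follows is only a minimal sketch in Python, not part of the Oracle announcement: the function name <code>backup_files</code> and the flattened naming of the copies are assumptions made for illustration.</p>

```python
import shutil
from pathlib import Path

# The three files the announcement says are restored on package removal.
MANAGED_FILES = [
    "/etc/security/limits.conf",
    "/boot/grub/grub.conf",
    "/etc/sysctl.conf",
]


def backup_files(paths, backup_dir):
    """Copy each existing file into backup_dir and return the copies made.

    Path separators are flattened into underscores so that files with the
    same base name from different directories cannot collide.
    (Hypothetical helper, not part of the pre-install RPM.)
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in map(Path, paths):
        if not path.is_file():
            continue  # skip files that do not exist on this system
        target = backup_dir / str(path).strip("/").replace("/", "_")
        shutil.copy2(path, target)  # copy2 also preserves timestamps and mode
        copied.append(target)
    return copied
```

<p>Calling <code>backup_files(MANAGED_FILES, "/root/preinstall-backup")</code> before removing the package would keep a copy of each file; the target directory is a made-up example, and reading these files may require root.</p>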
<p>x86_64:<br />
oracle-rdbms-server-11gR2-preinstall-1.0-3.el6.x86_64.rpm</p>
<p>SRPMS:</p>
<p>http://oss.oracle.com/ol6/SRPMS-updates/oracle-rdbms-server-11gR2-preinstall-1.0-3.el6.src.rpm</p>
<p>Description of changes:</p>
<p>[1.0-3.el6]<br />
- Added smartmontools as dependency, bz13653</p>
<p>[1.0-2.el6]<br />
- kernel.shmall=1073741824 as per bz7256<br />
- kernel.shmmax=4398046511104 as per bz7256<br />
- stack hard = 32768 as per doc max limit</p>
<p>[1.0-1.el6]<br />
- Renamed rpm to oracle-rdbms-server-11gR2-preinstall<br />
- Included xorg-x11-utils xorg-x11-xauth as dependency bz13653<br />
- Included kernel-uek as dependency.<br />
- fs.aio-max-nr=1048576 to match document,bz13653<br />
- kernel.shmall=2097152 to match document,bz13653<br />
- kernel.shmmax=536870912 to match document,bz13653<br />
- nofile soft = 1024 to match document,bz13653<br />
- nofile hard = 65536 to match document,bz13653<br />
- nproc soft = 2047 to match document,bz13653<br />
- nproc hard = 16384 to match document,bz13653<br />
- stack soft = 10240 to match document,bz13653<br />
- stack hard = 10240 to match document,bz13653</p>
<p>[1.0.0-3.el6]<br />
- removed util-linux and added util-linux-ng (fork of util-linux)<br />
- removed openssh and added openssh-clients bz13173<br />
- removed 32 bit dependency for x86_64 as per st docs.<br />
- removed kernel-uek-headers/kernel-headers<br />
- disable login for oracle user for bug12623491<br />
- Merge fix<br />
- Removed msgmni, msgmnb, msgmax for bz11029<br />
- Increase stack limit for oracle user bz11683<br />
- bugfix for bug11656858<br />
- added compat-libcap1 dependency bz12221<br />
- move link creation to install part bz11030<br />
- removed comment related to bugdb6820451<br />
- removed flowcontrol settings bz11508<br />
- Removed 10G related info from oracle-rdbms-server-11gR2-preinstall.param<br />
- Changed kernel.semmni to 128 as per 11203crs-cvu_prereq.xml<br />
- removed vm.min_free_kbytes<br />
- removed readme</p>
<p>[1.0.0-1.el6]<br />
- Changed requirement for x86_64 arch<br />
  /lib/libaio.so.1<br />
  libodbc.so.2()(64bit)<br />
  /usr/lib/libodbc.so.2<br />
  /usr/lib/gcc/x86_64-redhat-linux/4.4.4/libstdc++.a</p>
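<p>To see whether a host already meets the kernel values quoted in the changelog above, one could compare them with the contents of <code>/etc/sysctl.conf</code>. The sketch below is an assumption-laden illustration in Python: it treats the changelog values as minimums, only understands single-integer settings, and the helper names are invented here.</p>

```python
# Values quoted in the changelog above; the 1.0-2.el6 entries replace the
# kernel.shmall and kernel.shmmax values from 1.0-1.el6.
CHANGELOG_VALUES = {
    "fs.aio-max-nr": 1048576,
    "kernel.shmall": 1073741824,
    "kernel.shmmax": 4398046511104,
}


def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict of single-integer settings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            settings[key.strip()] = int(value.strip())
    return settings


def below_minimum(settings, minimums=CHANGELOG_VALUES):
    """Return {name: (current, required)} for settings that are missing or too low."""
    problems = {}
    for name, required in minimums.items():
        current = settings.get(name)
        if current is None or not current >= required:
            problems[name] = (current, required)
    return problems
```

<p>An empty result from <code>below_minimum(parse_sysctl(text))</code> would mean that all three parameters are at or above the quoted values.</p>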
<p>Software Accessibility</p>
<p>   All packages are available via ULN (http://linux.oracle.com) or the Oracle Public<br />
   Yum Repository (http://public-yum.oracle.com)</p>
<p>   For more information, please visit http://linux.oracle.com</p>
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2009/524.html' rel='bookmark' title='Install Oracle on Linux'>Install Oracle on Linux</a></li>
<li><a href='http://www.dbasoul.com/2010/707.html' rel='bookmark' title='Oracle De-install Utility'>Oracle De-install Utility</a></li>
<li><a href='http://www.dbasoul.com/2010/742.html' rel='bookmark' title='Oracle Database 11.2.0.2 Installation on Enterprise Linux x86 5.2 Part 1'>Oracle Database 11.2.0.2 Installation on Enterprise Linux x86 5.2 Part 1</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2012/1039.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Oracle Database 11.2.0.3 Released: Patch Download and Documentation Updated</title>
		<link>http://www.dbasoul.com/2011/1036.html</link>
		<comments>http://www.dbasoul.com/2011/1036.html#comments</comments>
		<pubDate>Sun, 25 Sep 2011 01:47:56 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[general]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1036</guid>
		<description><![CDATA[Oracle® Database Patch Set 11g Release 2 (11.2.0.3), Patch Set 2, has finally been released after a year-long wait. This release brings the software for Linux x86 and x86-64; other platforms will follow in October. The official documentation has been updated to the latest version.
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/738.html' rel='bookmark' title='Oracle 11.2.0.2 New Features'>Oracle 11.2.0.2 New Features</a></li>
<li><a href='http://www.dbasoul.com/2008/86.html' rel='bookmark' title='Oracle Metalink reference &#8211; Database'>Oracle Metalink reference &#8211; Database</a></li>
<li><a href='http://www.dbasoul.com/2012/1039.html' rel='bookmark' title='Easy way to install Oracle Database on Oracle Linux'>Easy way to install Oracle Database on Oracle Linux</a></li>
</ol>]]></description>
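			<content:encoded><![CDATA[<p class="lead">Oracle® Database Patch Set 11g Release 2 (11.2.0.3), Patch Set 2, has finally been released after a year-long wait. This release brings the software for Linux x86 and x86-64; other platforms will follow in October. The official documentation has been updated to the latest version.</p>
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/738.html' rel='bookmark' title='Oracle 11.2.0.2 New Features'>Oracle 11.2.0.2 New Features</a></li>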
<li><a href='http://www.dbasoul.com/2008/86.html' rel='bookmark' title='Oracle Metalink reference &#8211; Database'>Oracle Metalink reference &#8211; Database</a></li>
<li><a href='http://www.dbasoul.com/2012/1039.html' rel='bookmark' title='Easy way to install Oracle Database on Oracle Linux'>Easy way to install Oracle Database on Oracle Linux</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1036.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Oracle Drops HP-UX for Good</title>
		<link>http://www.dbasoul.com/2011/1035.html</link>
		<comments>http://www.dbasoul.com/2011/1035.html#comments</comments>
		<pubDate>Sun, 25 Sep 2011 01:40:33 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[general]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1035</guid>
		<description><![CDATA[Oracle Database 11.2 will be the final release for the HP-UX platform, on both the Itanium and PA-RISC architectures. From then on, only AIX, Linux, and Solaris remain. See the official notes: Oracle Database Support on the Itanium Processor Architecture [ID 1307745.1] Support Status for Oracle Database on HP-UX PA-RISC systems [ID 1313798.1] Release Schedule of Current Database Releases [ID 742060.1]
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/703.html' rel='bookmark' title='Oracle Database Product Release Schedule'>Oracle Database Product Release Schedule</a></li>
<li><a href='http://www.dbasoul.com/2008/180.html' rel='bookmark' title='isqlplus desupport on 11g'>isqlplus desupport on 11g</a></li>
<li><a href='http://www.dbasoul.com/2008/123.html' rel='bookmark' title='Oracle Enterprise Manager 10g Grid Control Certification Checker'>Oracle Enterprise Manager 10g Grid Control Certification Checker</a></li>
</ol>]]></description>
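			<content:encoded><![CDATA[<p class="lead">Oracle Database 11.2 will be the final release for the HP-UX platform, on both the Itanium and PA-RISC architectures. From then on, only AIX, Linux, and Solaris remain.<br />
See the official notes:<br />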
Oracle Database Support on the Itanium Processor Architecture [ID 1307745.1]<br />
Support Status for Oracle Database on HP-UX PA-RISC systems [ID 1313798.1]<br />
Release Schedule of Current Database Releases [ID 742060.1]</p>
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/703.html' rel='bookmark' title='Oracle Database Product Release Schedule'>Oracle Database Product Release Schedule</a></li>
<li><a href='http://www.dbasoul.com/2008/180.html' rel='bookmark' title='isqlplus desupport on 11g'>isqlplus desupport on 11g</a></li>
<li><a href='http://www.dbasoul.com/2008/123.html' rel='bookmark' title='Oracle Enterprise Manager 10g Grid Control Certification Checker'>Oracle Enterprise Manager 10g Grid Control Certification Checker</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1035.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Verify the Integrity of the Installation Media Before Installing Oracle Database</title>
		<link>http://www.dbasoul.com/2011/1034.html</link>
		<comments>http://www.dbasoul.com/2011/1034.html#comments</comments>
		<pubDate>Sat, 13 Aug 2011 03:28:11 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[install]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1034</guid>
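		<description><![CDATA[The Oracle Database 11.2 installation media keeps growing and now easily reaches 6 GB, so downloading it takes a long time. Oracle currently does not support download managers; the files can only be saved directly from the web page. Internet Explorer is the recommended browser for the download: in my attempts, files downloaded with Chrome and Firefox sometimes turned out incomplete. Several installation failures I ran into recently were all caused by incomplete downloads, so before installing, be sure to confirm that the installation software is correct and complete. On the download page, &#8220;View Digest&#8221; lists the MD5 and SHA-1 checksums of every file, and a checksum tool can verify the downloaded files against them. One such tool is Microsoft's FCIV: Availability and description of the File Checksum Integrity Verifier utility Happy Installation.
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/707.html' rel='bookmark' title='Oracle De-install Utility'>Oracle De-install Utility</a></li>
<li><a href='http://www.dbasoul.com/2010/698.html' rel='bookmark' title='Best Practices for Installing Oracle Database Patches'>Best Practices for Installing Oracle Database Patches</a></li>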
</ol>]]></description>
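			<content:encoded><![CDATA[<p class="lead">The Oracle Database 11.2 installation media keeps growing and now easily reaches 6 GB, so downloading it takes a long time. Oracle currently does not support download managers; the files can only be saved directly from the web page. Internet Explorer is the recommended browser for the download: in my attempts, files downloaded with Chrome and Firefox sometimes turned out incomplete.</p>
<p>Several installation failures I ran into recently were all caused by incomplete downloads, so before installing, be sure to confirm that the installation software is correct and complete.</p>
<p>On the download page, &#8220;View Digest&#8221; lists the MD5 and SHA-1 checksums of every file, and a checksum tool can verify the downloaded files against them.</p>
<p>One such tool is Microsoft's FCIV, available here: <a href="http://support.microsoft.com/kb/841290" target="_blank">Availability and description of the File Checksum Integrity Verifier utility</a></p>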
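<p>Where FCIV is not available, the same check can be done with Python's standard <code>hashlib</code> module. This is a minimal sketch rather than an official procedure; the function names are made up for illustration, and the expected value has to be copied from the digest list on the download page.</p>

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex, algorithm="sha1"):
    """Compare a file's digest with the value shown on the download page."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

<p>Reading in chunks matters here because the installation archives are several gigabytes in size; a result of <code>False</code> from <code>matches_published</code> suggests the download should be repeated.</p>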
<p>Happy Installation.</p>
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/707.html' rel='bookmark' title='Oracle De-install Utility'>Oracle De-install Utility</a></li>
<li><a href='http://www.dbasoul.com/2010/698.html' rel='bookmark' title='Best Practices for Installing Oracle Database Patches'>Best Practices for Installing Oracle Database Patches</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1034.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Case Study: ORA-1031 When Logging In as SYS on Windows XP</title>
		<link>http://www.dbasoul.com/2011/1032.html</link>
		<comments>http://www.dbasoul.com/2011/1032.html#comments</comments>
		<pubDate>Fri, 05 Aug 2011 09:51:14 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[reference]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1032</guid>
		<description><![CDATA[On a database installed on Windows XP, logging in as sys fails with ORA-1031. sqlplus / as sysdba SQL*Plus: Release 10.2.0.1.0 &#8211; Production on M Mar 8 17:22:36 2010 Copyright (c) 1982, 2003, Oracle. All Rights Reserved. ERROR: ORA-1031: Insufficient privileges. Check the configuration in the sqlnet.ora file: SQLNET.AUTHENTICATION_SERVICES=(NTS) Change the setting as follows: SQLNET.AUTHENTICATION_SERVICES=(nts) The AUTHENTICATION_SERVICES parameter commonly takes the following values: none for no authentication methods, including Microsoft Windows native operating system authentication. When SQLNET.AUTHENTICATION_SERVICES is set to none, a valid user...  <a href="http://www.dbasoul.com/2011/1032.html" class="more-link" title="Read Case Study: ORA-1031 When Logging In as SYS on Windows XP">Read more &#187;</a>
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/706.html' rel='bookmark' title='11gR2 for Windows Release'>11gR2 for Windows Release</a></li>
<li><a href='http://www.dbasoul.com/2010/710.html' rel='bookmark' title='Cisco VPN Client x64 for Windows 7'>Cisco VPN Client x64 for Windows 7</a></li>
<li><a href='http://www.dbasoul.com/2011/962.html' rel='bookmark' title='Config Oracle 10g VIP on Unix'>Config Oracle 10g VIP on Unix</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p class="lead">On a database installed on Windows XP, logging in as sys fails with ORA-1031.</p>
<p>sqlplus / as sysdba</p>
<p>SQL*Plus: Release 10.2.0.1.0 &#8211; Production on M Mar 8 17:22:36 2010</p>
<p>Copyright (c) 1982, 2003, Oracle. All Rights Reserved.</p>
<p>ERROR:<br />
ORA-1031: Insufficient privileges.</p>
<p>Check the configuration in the sqlnet.ora file:<br />
SQLNET.AUTHENTICATION_SERVICES=(NTS)</p>
<p>Change the setting as follows:<br />
SQLNET.AUTHENTICATION_SERVICES=(nts)</p>
<p>The AUTHENTICATION_SERVICES parameter commonly takes the following values:</p>
<p><strong>none</strong> for no authentication methods, including Microsoft Windows native operating system authentication. When SQLNET.AUTHENTICATION_SERVICES is set to none, a valid user name and password can be used to access the database.</p>
<p><strong>all</strong> for all authentication methods.</p>
<p><strong>nts</strong> for Microsoft Windows native operating system authentication.</p>
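<p>When troubleshooting this kind of login failure across several machines, it can help to read the setting programmatically. The following Python sketch is a hypothetical helper, not an Oracle tool: it only handles the single-line form shown above and lowercases the values for comparison, which says nothing about how the Oracle client itself treats case.</p>

```python
def authentication_services(sqlnet_text):
    """Return the configured authentication services as lowercase names.

    Returns None when SQLNET.AUTHENTICATION_SERVICES is not set in the text.
    Only the single-line "NAME=(value, ...)" form is understood.
    """
    for line in sqlnet_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition("=")
        if sep and name.strip().upper() == "SQLNET.AUTHENTICATION_SERVICES":
            inner = value.strip().strip("()")
            return [item.strip().lower() for item in inner.split(",") if item.strip()]
    return None
```

<p>For the file in this case, the function would return a one-element list containing <code>nts</code>; a return value of <code>None</code> means the parameter is absent from the text that was passed in.</p>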
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/706.html' rel='bookmark' title='11gR2 for Windows Release'>11gR2 for Windows Release</a></li>
<li><a href='http://www.dbasoul.com/2010/710.html' rel='bookmark' title='Cisco VPN Client x64 for Windows 7'>Cisco VPN Client x64 for Windows 7</a></li>
<li><a href='http://www.dbasoul.com/2011/962.html' rel='bookmark' title='Config Oracle 10g VIP on Unix'>Config Oracle 10g VIP on Unix</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1032.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Relic Architecture &#8211; Collecting 20+ Billion Metrics A Day</title>
		<link>http://www.dbasoul.com/2011/1030.html</link>
		<comments>http://www.dbasoul.com/2011/1030.html#comments</comments>
		<pubDate>Mon, 18 Jul 2011 23:42:28 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Architecture]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1030</guid>
		<description><![CDATA[This is a guest post by Brian Doll, Application Performance Engineer at New Relic. New Relic’s multitenant, SaaS web application monitoring service collects and persists over 100,000 metrics every second on a sustained basis, while still delivering an average page load time of 1.5 seconds.  We believe that good architecture and good tools can help you handle...  <a href="http://www.dbasoul.com/2011/1030.html" class="more-link" title="Read New Relic Architecture &#8211; Collecting 20+ Billion Metrics A Day">Read more &#187;</a>
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2011/1002.html' rel='bookmark' title='Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App'>Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App</a></li>
<li><a href='http://www.dbasoul.com/2011/1013.html' rel='bookmark' title='37signals Architecture'>37signals Architecture</a></li>
<li><a href='http://www.dbasoul.com/2011/973.html' rel='bookmark' title='Facebook: An Example Canonical Architecture For Scaling Billions Of Messages'>Facebook: An Example Canonical Architecture For Scaling Billions Of Messages</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p class="lead"><em>This is a guest post by <a href="mailto:brian@newrelic.com">Brian Doll</a>, Application Performance Engineer at New Relic.</em></p>
<p>New Relic’s multitenant, SaaS web application monitoring service collects and persists over 100,000 metrics every second on a sustained basis, while still delivering an average page load time of 1.5 seconds.  We believe that good architecture and good tools can help you handle an extremely large amount of data while still providing extremely fast service.  Here we&#8217;ll show you how we do it.</p>
<ul>
<li> New Relic is Application Performance Management (APM) as a Service</li>
<li> In-app agent instrumentation (bytecode instrumentation, etc.)</li>
<li> Support for 5 programming languages (Ruby, Java, PHP, .NET, Python)</li>
<li> 175,000+ app processes monitored globally</li>
<li> 10,000+ customers</li>
</ul>
<h2>The Stats</h2>
<ul>
<li> 20+ Billion application metrics collected every day</li>
<li> 1.7+ Billion web page metrics collected every week</li>
<li> Each &#8220;timeslice&#8221; metric is about 250 bytes</li>
<li> 100k timeslice records inserted every second</li>
<li> 7 Billion new rows of data every day</li>
<li> Data collection handled by 9 sharded MySQL servers</li>
</ul>
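<p>As a rough sanity check on these figures, the per-second insert rate and the record size can be turned into daily volumes. This is back-of-the-envelope arithmetic added for illustration and it assumes the 100k inserts per second hold for all 24 hours; the article's own figure of 7 billion new rows a day suggests the average rate is somewhat lower, so the result is best read as an upper bound.</p>

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume(records_per_second, bytes_per_record):
    """Return (records per day, bytes per day) for a sustained insert rate."""
    records = records_per_second * SECONDS_PER_DAY
    return records, records * bytes_per_record


# Figures from the list above: 100k timeslice records/s at about 250 bytes each.
records, size_bytes = daily_volume(100_000, 250)
print(records)            # 8640000000
print(size_bytes / 1e12)  # 2.16
```

<p>Under that assumption the collectors would write about 8.6 billion rows and roughly 2.2 TB of raw timeslice data per day, before indexes and any other overhead.</p>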
<h2>Architecture Overview</h2>
<ul>
<li>Language-specific agents (Ruby, Java, PHP, .NET, Python) send application metrics back to New Relic once every minute</li>
<li>The &#8220;collector&#8221; service digests app metrics and persists them in the right MySQL shard</li>
<li>Real User Monitoring javascript snippet sends front-end performance data to the &#8220;beacon&#8221; service for every single page view</li>
<li>Customers log into http://rpm.newrelic.com/ to view their performance dashboard</li>
<li>The amount of data we collect every day is staggering.  Initially all data is captured at full resolution for each metric.  Over time we reperiodize the data, going from minute-by-minute to hourly and then finally to daily averages.  For our professional accounts, we store daily metric data indefinitely, so customers can see how they&#8217;ve improved over the long haul.</li>
<li>Our data strategy is optimized for reading, since our core application is constantly needing to access metric data by time series.  It&#8217;s easier to pay a penalty on write to keep the data optimized for faster reads, to ensure our customers can quickly access their performance data any time of the day. Sharding our database helps by distributing customers across multiple servers.  Within each server we have individual tables per customer to keep the customer data close together on disk and to keep the total number of rows per table down.</li>
<li>New Relic manages several types of alerts for monitoring systems.  Customers can set thresholds on their APDEX score and error rate.  New Relic also has an availability monitoring feature, so customers can get alerted on downtime events as short as 30 seconds.  We send email alerts primarily, with several customers using our PagerDuty.com integration for more complex on-call rotations with SMS integration.</li>
<li>Let&#8217;s take a single web transaction from a customer request all the way through the New Relic stack.
<ol>
<li>An end user views a page on Example.com, which uses New Relic to monitor its app performance</li>
<li>The application running Example.com is running with New Relic agents installed (for Ruby, Java, PHP, .NET or Python)</li>
<li>Detailed performance metrics are captured for each transaction, including time spent in each component, database queries, external API calls, etc.</li>
<li>These back-end metrics are held in the customer&#8217;s New Relic agent for up to one minute, after which they are sent back to the New Relic data collection service</li>
<li>Meanwhile, embedded in the web page is the New Relic Real-User Monitoring JavaScript code, which tracks the performance of this single customer&#8217;s experience</li>
<li>When the page is fully rendered within the customer&#8217;s browser, the New Relic beacon gets a request providing performance metrics on the back-end, network, DOM processing and page rendering times.</li>
<li>An engineer working on Example.com logs into New Relic and sees up-to-the-minute application performance metrics as well as the end-user experience for every single customer, including browser and geographic information.</li>
</ol>
</li>
</ul>
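<p>The &#8220;reperiodizing&#8221; step described in the overview, going from minute-by-minute data to hourly and then daily averages, might look roughly like the Python sketch below. New Relic's actual implementation is not shown in the article; the function name and the representation of samples as (timestamp, value) pairs are assumptions.</p>

```python
from collections import defaultdict


def reperiodize(samples, period_seconds):
    """Average (timestamp, value) samples into buckets of period_seconds.

    Timestamps are integer seconds. Returns a list of
    (bucket_start, mean_value) pairs sorted by bucket start.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        bucket_start = timestamp - timestamp % period_seconds
        buckets[bucket_start].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]
```

<p>Feeding the hourly output back in with a period of 86400 seconds yields daily averages. Note that this averages the hourly means without weighting, which equals the true daily mean only when every hour holds the same number of samples.</p>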
<h2>Platform</h2>
<h3>Web UI</h3>
<ul>
<li>Ruby on Rails</li>
<li>nginx</li>
<li>Linux</li>
<li>2 @ 12 core Intel Nehalem CPUs w/ 48GB RAM</li>
</ul>
<h3>Data Collector And Web Beacon Services</h3>
<div id="_mcePaste">
<ul>
<li> Java</li>
<li> Servlets on Jetty</li>
<li> App metrics collector: 180k+ requests per minute, responding in 3ms</li>
<li> Web metrics beacon service: 200k+ requests per minute, responding in 0.15ms</li>
<li> Sharded MySQL using the Percona build</li>
<li> Linux</li>
<li> 9 @ 24 core Intel Nehalem w/ 48GB RAM, SAS attached RAID 5</li>
<li> Bare metal (no virtualization)</li>
</ul>
</div>
<h3>Interesting MySQL Stats:</h3>
<div id="_mcePaste">
<ul>
<li>New Relic creates a database table per account per hour to hold metric data.</li>
<li>This table strategy is optimized for reads vs. writes</li>
<li>Constantly need to render charts based on one or more metrics for a specific account in a specific time window</li>
<li>The primary key for metrics (metric, agent, timestamp) allows data for a particular metric from a particular agent to be located together on disk</li>
<li>Over time this causes more page splits in InnoDB, and I/O operations increase throughout the hour until a new table is created</li>
<li>New accounts are assigned a specific shard in a round-robin fashion. Since some accounts are larger than others, shards are occasionally pulled out of the assignment queue to more evenly distribute load.</li>
<li>Having so many tables with this amount of data in them makes schema migrations impossible. Instead, &#8220;template&#8221; tables are used from which new timeslice tables are created.  New tables use the new definition while old tables are eventually purged from the system.  The application code needs to be aware that multiple table definitions may be active at one time.</li>
</ul>
</div>
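<p>The shard assignment policy mentioned in the list above, round-robin with the option of pulling a busy shard out of the rotation, can be sketched in a few lines of Python. The class and method names are invented for illustration and are not New Relic's code.</p>

```python
from collections import deque


class ShardAssigner:
    """Round-robin assignment of new accounts to shards (illustrative only)."""

    def __init__(self, shard_names):
        self._queue = deque(shard_names)

    def assign(self):
        """Return the next shard and move it to the back of the rotation."""
        if not self._queue:
            raise RuntimeError("no shards are accepting new accounts")
        shard = self._queue[0]
        self._queue.rotate(-1)
        return shard

    def pull(self, shard):
        """Stop assigning new accounts to a shard, e.g. one hosting large accounts."""
        self._queue.remove(shard)

    def restore(self, shard):
        """Put a shard back at the end of the rotation."""
        self._queue.append(shard)
```

<p>Existing accounts stay on the shard they were given; pulling a shard only affects where new accounts land, which matches the article's description of evening out load over time.</p>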
<h2>Challenges</h2>
<div id="_mcePaste">
<ul>
<li>Data purging: Summarization of metrics and purging granular metric data is an expensive and nearly continuous process</li>
<li>Determining what metrics can be pre-aggregated</li>
<li>Large accounts: Some customers have many applications, while others have a staggering number of servers</li>
<li>MySQL optimization and tuning, including the OS and filesystem</li>
<li>I/O performance: bare metal db servers, table-per-account vs. large tables for read performance</li>
<li>Load balancing shards: Big accounts, small accounts, high-utilization accounts</li>
</ul>
</div>
<h2>Lessons Learned</h2>
<div id="_mcePaste">
<ul>
<li>New Relic monitors its own services with New Relic (staging monitors production)</li>
<li>Aim for operational efficiency and simplicity at every turn</li>
<li>Stay lean. New Relic has ~30 engineers supporting 10k customers</li>
<li>Trendy != Reliable: There are lots of essential yet boring aspects to high-performing systems that not all trendy solutions have solved for yet.</li>
<li>Use the right tech for the job. The main New Relic web application has always been a Rails app.  The data collection tier was originally written in Ruby, but was eventually ported over to Java.  The primary driver for this change was performance.  This tier currently supports over 180k requests per minute and responds in around 2.5 milliseconds with plenty of headroom to go.</li>
</ul>
<p><a href="http://highscalability.com/blog/2011/7/18/new-relic-architecture-collecting-20-billion-metrics-a-day.html">http://highscalability.com/blog/2011/7/18/new-relic-architecture-collecting-20-billion-metrics-a-day.html</a></p></div>
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2011/1002.html' rel='bookmark' title='Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App'>Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App</a></li>
<li><a href='http://www.dbasoul.com/2011/1013.html' rel='bookmark' title='37signals Architecture'>37signals Architecture</a></li>
<li><a href='http://www.dbasoul.com/2011/973.html' rel='bookmark' title='Facebook: An Example Canonical Architecture For Scaling Billions Of Messages'>Facebook: An Example Canonical Architecture For Scaling Billions Of Messages</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1030.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Oracle Database Documentation Now Available in Mobi and ePub Formats</title>
		<link>http://www.dbasoul.com/2011/1025.html</link>
		<comments>http://www.dbasoul.com/2011/1025.html#comments</comments>
		<pubDate>Sat, 09 Jul 2011 16:02:48 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[reference]]></category>
		<category><![CDATA[doc]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1025</guid>
		<description><![CDATA[While reading the documentation on the official site recently, I noticed that the Oracle Database documentation, for both 10g and 11g, now comes in two new formats, Mobi and ePub, in addition to the existing HTML and PDF. They target today's popular devices, the Kindle and the iPad, so reading the official documentation on a tablet becomes more convenient. See the official site: http://www.oracle.com/technetwork/indexes/documentation/index.html
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/711.html' rel='bookmark' title='Oracle Grid Control 11gR1 Download'>Oracle Grid Control 11gR1 Download</a></li>
<li><a href='http://www.dbasoul.com/2009/524.html' rel='bookmark' title='Install Oracle on Linux'>Install Oracle on Linux</a></li>
<li><a href='http://www.dbasoul.com/2010/733.html' rel='bookmark' title='Oracle blogs'>Oracle blogs</a></li>
</ol>]]></description>
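			<content:encoded><![CDATA[<p class="lead">While reading the documentation on the official site recently, I noticed that the Oracle Database documentation, for both 10g and 11g, now comes in two new formats, Mobi and ePub, in addition to the existing HTML and PDF. They target today's popular devices, the Kindle and the iPad, so reading the official documentation on a tablet becomes more convenient.</p>
<p>See the official site: <a href="http://www.oracle.com/technetwork/indexes/documentation/index.html">http://www.oracle.com/technetwork/indexes/documentation/index.html</a></p>
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2010/711.html' rel='bookmark' title='Oracle Grid Control 11gR1 Download'>Oracle Grid Control 11gR1 Download</a></li>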
<li><a href='http://www.dbasoul.com/2009/524.html' rel='bookmark' title='Install Oracle on Linux'>Install Oracle on Linux</a></li>
<li><a href='http://www.dbasoul.com/2010/733.html' rel='bookmark' title='Oracle blogs'>Oracle blogs</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1025.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MapReduce</title>
		<link>http://www.dbasoul.com/2011/1022.html</link>
		<comments>http://www.dbasoul.com/2011/1022.html#comments</comments>
		<pubDate>Tue, 28 Jun 2011 14:48:24 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[MapReduce]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1022</guid>
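		<description><![CDATA[MapReduce is a software architecture proposed by Google for parallel computation over large data sets (larger than 1 TB). The concepts &#8220;Map&#8221; and &#8220;Reduce&#8221;, and the main ideas behind them, are borrowed from functional programming languages, along with features borrowed from vector programming languages.[1] The current software implementation specifies a Map function, which maps a set of key-value pairs to a new set of key-value pairs, and a concurrent Reduce function, which makes sure that all mapped key-value pairs sharing the same key are grouped together. Map and Reduce Put simply, a map function applies a specified operation to every element of a conceptual list of independent elements (for example, a list of test scores). Say someone finds that every student's score was overstated by one point; they can define a &#8220;subtract one&#8221; map function to correct the error. Each element is operated on independently, and the original list is not modified, because a new list is created to hold the new answers. This means the Map operation is highly parallelizable, which is very useful for applications with high performance requirements and for the field of parallel computing. A reduce operation, in turn, combines the elements of a list in a suitable way. Continuing the example, what if someone wants to know the class average? They can define a reduce function that halves the list by adding each odd- or even-positioned element to its neighbor, and recurse until only one element is left; dividing that element by the number of students gives the average. Although reduce is not as parallel as map, it always produces a single simple answer and the large-scale computation is relatively independent, so reduce functions are also useful in highly parallel environments. Distribution and Reliability MapReduce achieves reliability by distributing the large-scale operations on the data set to every node on the network; each node periodically reports back its completed work and status updates. If a node stays silent for longer than a preset interval, the master node (similar to the master server in the Google File System) records the node as dead and sends the data assigned to it to other nodes. Each operation uses atomic operations on named files to make sure that parallel threads do not conflict; when files are renamed, the system may also copy them to a name other than the task name (to avoid side effects). Reduce operations work in much the same way, but because reduce is less parallelizable, the master node tries to schedule reduce operations on a single node, or on nodes as close as possible to the data being operated on. This property suits Google's needs, because they have ample bandwidth and their internal network does not have that many machines. Uses At Google, MapReduce is used in a very wide range of applications, including &#8220;distributed grep, distributed sort, web link-graph reversal, term vectors per host, web access log analysis, inverted index construction, document clustering, machine learning, statistical machine translation...&#8221; Notably, once MapReduce was implemented, it was used to regenerate Google's entire index, replacing the old ad hoc programs that updated the index. MapReduce generates a large number of temporary files; for efficiency, it uses the Google File System to manage and access them. In 2011, Google released the Snappy compression library as an open-source project; Snappy is part of Google MapReduce and improves performance by reducing network I/O or disk I/O.[1] Other implementations Hadoop: an open-source project of the Apache Software Foundation that provides functionality similar to MapReduce and its file system. http://hadoop.apache.org/ http://zh.wikipedia.org/zh-cn/MapReduce &#160;
No related posts.]]></description>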
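			<content:encoded><![CDATA[<p class="lead"><strong>MapReduce</strong> is a software architecture proposed by <a title="Google" href="http://zh.wikipedia.org/wiki/Google">Google</a> for parallel computation over large data sets (larger than 1 <a title="Terabyte" href="http://zh.wikipedia.org/wiki/Terabyte">TB</a>). The concepts &#8220;Map&#8221; and &#8220;Reduce&#8221;, and the main ideas behind them, are borrowed from <a title="Functional programming language" href="http://zh.wikipedia.org/wiki/%E5%87%BD%E6%95%B0%E5%BC%8F%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80">functional programming languages</a>, along with features borrowed from <a title="Vector programming language (page not yet written)" href="http://zh.wikipedia.org/w/index.php?title=%E7%9F%A2%E9%87%8F%E7%BC%96%E7%A8%8B%E8%AF%AD%E8%A8%80&amp;action=edit&amp;redlink=1">vector programming languages</a>.<sup><a rel="nofollow" href="http://zh.wikipedia.org/wiki/MapReduce#endnote_map">[1]</a></sup></p>
<p>The current software implementation specifies a <em>Map</em> function, which maps a set of key-value pairs to a new set of key-value pairs, and a concurrent <em>Reduce</em> function, which makes sure that all mapped key-value pairs sharing the same key are grouped together.</p>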
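<h2>Map and Reduce</h2>
<p>Simply put, a map function applies a specified operation to every element of a conceptual list of independent elements (for example, a list of test scores). Suppose someone discovers that every student&#8217;s score was overstated by one point; they can define a &#8220;subtract one&#8221; map function to correct the error. Each element is operated on independently, and the original list is not modified because a new list is created to hold the results. This means a Map operation can be highly parallel, which is very useful for applications with high performance requirements and for the field of <a title="Parallel computing" href="http://zh.wikipedia.org/wiki/%E5%B9%B6%E8%A1%8C%E8%AE%A1%E7%AE%97">parallel computing</a>.</p>
<p>A reduce operation, by contrast, combines the elements of a list in an appropriate way. Continuing the example, suppose someone wants to know the class average. They can define a reduce function that halves the list by adding each odd- or even-indexed element to its neighbor, apply it recursively until only one element remains, and then divide that element by the number of students to get the average. Although reduce is not as parallel as map, it always yields a single simple answer and the large-scale computations are relatively independent, so reduce functions are also useful in highly parallel environments.</p>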
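<p>The test-score example can be written out as a minimal sketch. The function names and the sample scores are made up for illustration; the reduce step follows the pairwise halving described above rather than calling a built-in sum.</p>

```python
def subtract_one(scores):
    """Map: build a new list with one point removed from every score."""
    return [score - 1 for score in scores]


def pairwise_sum(values):
    """Reduce: add neighbouring elements, halving the list on each pass."""
    values = list(values)
    if not values:
        raise ValueError("cannot reduce an empty list")
    while len(values) != 1:
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            # An odd element out is carried over to the next pass unchanged.
            paired.append(values[-1])
        values = paired
    return values[0]


scores = [91, 86, 78, 101, 66]      # hypothetical, each overstated by one point
corrected = subtract_one(scores)    # the original list is left untouched
average = pairwise_sum(corrected) / len(corrected)
print(corrected, average)
```

<p>Each pass of <code>pairwise_sum</code> adds independent pairs, so the additions within one pass could run in parallel even though the passes themselves must run in order.</p>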
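<h2>Distribution and Reliability</h2>
<p>MapReduce achieves reliability by distributing large-scale operations on the data set to the nodes of the network; each node periodically reports back its completed work and status updates. If a node stays silent for longer than a preset interval, the master node (similar to the master server in the <a title="Google File System" href="http://zh.wikipedia.org/wiki/Google%E6%AA%94%E6%A1%88%E7%B3%BB%E7%B5%B1">Google File System</a>) records the node as dead and sends the data assigned to it to other nodes. Each operation uses atomic operations for naming files to ensure that parallel threads do not conflict; when files are renamed, the system may also copy them to another name in addition to the task name (to avoid <a title="Side effect (computer science) (not yet written)" href="http://zh.wikipedia.org/w/index.php?title=%E5%89%AF%E4%BD%9C%E7%94%A8(%E8%AE%A1%E7%AE%97%E6%9C%BA%E7%A7%91%E5%AD%A6)&amp;action=edit&amp;redlink=1">side effects</a>).</p>
<p>Reduce operations work in much the same way, but because they parallelize less well, the master node tries to schedule each reduce operation on the same node as the data it operates on, or on a node as close to that data as possible. This property suits Google because it conserves bandwidth, which its internal network does not have in abundance.</p>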
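<p>The timeout rule for silent nodes can be sketched as follows. The <code>Master</code> class, its method names and the round-robin reassignment are assumptions made for this example; the text above does not say how Google&#8217;s master chooses the replacement node. Time is passed in explicitly so the behaviour is easy to follow.</p>

```python
class Master:
    """Hypothetical master that tracks heartbeats and reassigns work."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}     # worker name -> time of last heartbeat
        self.assignments = {}   # worker name -> list of input splits
        self.dead = set()

    def assign(self, worker, split, now):
        self.assignments.setdefault(worker, []).append(split)
        self.last_seen.setdefault(worker, now)

    def heartbeat(self, worker, now):
        self.last_seen[worker] = now

    def check(self, now):
        """Mark workers silent for longer than the timeout as dead and
        hand their splits to the remaining live workers (round-robin)."""
        newly_dead = [w for w, seen in self.last_seen.items()
                      if w not in self.dead and now - seen > self.timeout]
        self.dead.update(newly_dead)
        live = sorted(w for w in self.last_seen if w not in self.dead)
        for worker in newly_dead:
            orphaned = self.assignments.pop(worker, [])
            for i, split in enumerate(orphaned):
                if not live:
                    raise RuntimeError("no live workers left")
                self.assign(live[i % len(live)], split, now)
        return newly_dead


master = Master(timeout_seconds=10)
master.assign("w1", "split-a", now=0)
master.assign("w2", "split-b", now=0)
master.heartbeat("w1", now=8)       # w2 never reports back
lost = master.check(now=12)
print(lost, master.assignments)
```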
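<h2>Uses</h2>
<p>At Google, MapReduce is used in a very wide range of applications, including &#8220;distributed <a title="Grep" href="http://zh.wikipedia.org/wiki/Grep">grep</a>, distributed sort, web link-graph reversal, term-vector per host, web access log analysis, inverted index construction, document clustering, <a title="Machine learning" href="http://zh.wikipedia.org/wiki/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0">machine learning</a>, statistical machine translation...&#8221; Notably, once MapReduce was implemented, it was used to regenerate Google&#8217;s entire index, replacing the old ad hoc programs that updated the index.</p>
<p>MapReduce generates a large number of temporary files; for efficiency, it uses the <a title="Google File System" href="http://zh.wikipedia.org/wiki/Google%E6%AA%94%E6%A1%88%E7%B3%BB%E7%B5%B1">Google File System</a> to manage and access them.</p>
<p>In 2011, Google released the Snappy <a title="Data compression" href="http://zh.wikipedia.org/wiki/%E8%B3%87%E6%96%99%E5%A3%93%E7%B8%AE">compression</a> <a title="Library" href="http://zh.wikipedia.org/wiki/%E5%87%BD%E5%BC%8F%E5%BA%AB">library</a> as an <a title="Open source" href="http://zh.wikipedia.org/wiki/%E5%BC%80%E6%BA%90">open-source</a> project. Snappy is used within Google&#8217;s MapReduce to reduce network I/O or disk I/O and improve performance.<sup id="cite_ref-0"><a href="http://zh.wikipedia.org/zh-cn/MapReduce#cite_note-0">[1]</a></sup></p>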
<h2>Other Implementations</h2>
<ul>
<li><a title="Hadoop" href="http://zh.wikipedia.org/wiki/Hadoop">Hadoop</a> &#8211; an <a title="Open source" href="http://zh.wikipedia.org/wiki/%E9%96%8B%E6%94%BE%E6%BA%90%E7%A2%BC">open-source</a> project of the <a title="Apache Software Foundation" href="http://zh.wikipedia.org/wiki/Apache%E8%BB%9F%E4%BB%B6%E5%9F%BA%E9%87%91%E6%9C%83">Apache Software Foundation</a> that provides functionality similar to MapReduce and the Google File System. <a rel="nofollow" href="http://hadoop.apache.org/">http://hadoop.apache.org/</a></li>
</ul>
<p><a href="http://zh.wikipedia.org/zh-cn/MapReduce">http://zh.wikipedia.org/zh-cn/MapReduce</a></p>
<p>No related posts.</p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1022.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Medialets Architecture &#8211; Defeating The Daunting Mobile Device Data Deluge</title>
		<link>http://www.dbasoul.com/2011/1021.html</link>
		<comments>http://www.dbasoul.com/2011/1021.html#comments</comments>
		<pubDate>Tue, 28 Jun 2011 07:16:20 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Architecture]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1021</guid>
		<description><![CDATA[Mobile developers have a huge scaling problem ahead: doing something useful with massive continuous streams of telemetry data from millions and millions of devices. This is a really good problem to have. It means smartphone sales are finally fulfilling their destiny: slaughtering PCs in the sales arena. And it also means mobile devices aren&#8217;t just containers for...  <a href="http://www.dbasoul.com/2011/1021.html" class="more-link" title="Read Medialets Architecture &#8211; Defeating The Daunting Mobile Device Data Deluge">Read more &#187;</a>
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2011/977.html' rel='bookmark' title='TripAdvisor Architecture &#8211; 40M Visitors, 200M Dynamic Page Views, 30TB Data'>TripAdvisor Architecture &#8211; 40M Visitors, 200M Dynamic Page Views, 30TB Data</a></li>
<li><a href='http://www.dbasoul.com/2008/35.html' rel='bookmark' title='Oracle Metalink Reference &#8211; Data guard'>Oracle Metalink Reference &#8211; Data guard</a></li>
<li><a href='http://www.dbasoul.com/2010/731.html' rel='bookmark' title='Data Recovery Advisor – oracle 11g new feature'>Data Recovery Advisor – oracle 11g new feature</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p class="lead">Mobile developers have a huge scaling problem ahead: doing something useful with massive continuous streams of telemetry data from millions and millions of devices. This is a really good problem to have. It means smartphone sales are finally fulfilling their destiny: <a href="http://www.mobiledia.com/news/81680.html">slaughtering PCs</a> in the sales arena. And it also means mobile devices aren&#8217;t just containers for simple standalone apps anymore, they are becoming the dominant interface to giant backend systems.</p>
<p>While developers are now rocking mobile development on the client side, their next challenge is how to code those tricky backend bits. A company facing those same exact problems right now is <a href="http://www.medialets.com/">Medialets</a>, a mobile rich media ad platform. What they do is help publishers create high quality interactive ads, though for our purposes their ad stuff isn&#8217;t that interesting. What I did find really interesting about their system is how they are tackling the problem of defeating the mobile device data deluge.</p>
<p>Each day Medialets munches on billions of new objects embedded in a stream of terabytes of raw event data flowing in from millions of mobile devices. All that data must be: generated on the mobile device; transmitted over lossy connections punctuated by long periods of disconnection; crunched; made available to reporting systems; fed back into control systems that must be able to respond within milliseconds to requests.</p>
<p>This will become a common paradigm for systems featuring mobile devices. The question is, how can you make it happen?</p>
<p>Now that&#8217;s interesting.</p>
<p>To help us understand more about how Medialets works, Joe Stein, Engineering Manager for Server Platforms at Medialets, was kind enough to talk to me about what they are doing. Joe also runs an excellent Hadoop podcast and blog called <a href="http://allthingshadoop.com/">All Things Hadoop</a>, check it out.</p>
<p>Joe has worked on this problem a lot and has some great ideas on how to build an effective mobile data cruncher using tools like Hadoop, MySQL, HBase, Cassandra, Ruby, Python, and Java&#8230;</p>
<h2>Site</h2>
<ul>
<li><a href="http://www.medialets.com/">http://www.medialets.com</a> &#8211; home page.</li>
<li><a href="http://www.medialytics.com/">http://www.medialytics.com</a> &#8211; insights dashboard for analyzing events from applications.</li>
</ul>
<h2>What Is Medialets?</h2>
<p>Medialets delivers rich media advertising to mobile devices like the iPhone, iPad, and Android. Rich media means advertisements can be complex applications that embed, using an SDK, event generation and advertising functionality. The idea is that ads run inside the platform and instead of being lame AdSense-type ads, these can be fully interactive while providing the same brand quality they have on TV, except it&#8217;s on the mobile device. Applications can have you do things like shake the device or play a little football game with Michael Strahan. All this activity generates data that must be streamed back to their server farm for processing.</p>
<p>In addition to advertising they also provide very elaborate <a href="http://www.medialytics.com/">app based analytics</a> for their publishers.</p>
<p>To see examples of their ads <a href="http://www.medialets.com/showcase">go here</a>.</p>
<h2>The Stats</h2>
<ul>
<li>2-3 TB of new data every day (uncompressed).</li>
<li>Tens of billions of analytic events created per day. An event means somebody shook, turned off, rotated, etc. the app.</li>
<li>200+ Premium Applications run on tens of millions of mobile devices (iPhone, iPad, and Android).</li>
<li>Over 700 billion analytics events already processed.</li>
<li>During typical MapReduce jobs they can crunch on over 1 million events per second.</li>
<li>Data is often available in <a href="http://www.medialytics.com/" target="_blank">www.medialytics.com</a> within 1 hour of coming in from mobile devices.</li>
<li>Approximately 100 servers total.</li>
<li>Mobile is growing. Smartphone <a href="http://www.mobiledia.com/news/81680.html">sales have outpaced PC sales</a> for the first time and Medialets is seeing that in their growth. iPhone, iPad and Android <a href="http://www.medialets.com/medialets-data-spotlight-mobile-rich-media-momentum-q4-2011/">devices increased nearly</a> 300% in the fourth quarter of 2010. Android continues to grow market share, but iOS still dominates for premium mobile inventory.</li>
<li>A dozen people are in engineering, mostly on the client side and Medialytics. The infrastructure team is 1-2 people.</li>
</ul>
<h2>Infrastructure</h2>
<ul>
<li>Ad server instances running on quad core x 16GB w/ 1TB</li>
<li>Event Tracking Servers on 16 core x 12GB w/ 10TB</li>
<li>Event Processing, Job Execution, Log Collection &amp; Aggregate Pre-Processing each 16 core 24GB RAM w/ 6TB of space</li>
<li>Log Collection &amp; Aggregate Pre-Processing and Post-Processing. Hadoop Cluster nodes each 16 core x 12GB RAM w/ 2x2TB (JBOD).</li>
<li><a href="http://www.medialytics.com/" target="_blank">www.medialytics.com</a> varying configurations all with 16 cores from 12 to 96GB of RAM w/ 1-2TB</li>
</ul>
<h2>Open Source Systems &amp; Tools In Production</h2>
<ul>
<li>Linux</li>
<li>Haproxy</li>
<li>Tomcat</li>
<li>MySQL</li>
<li>Apache</li>
<li>Passenger</li>
<li>RoR</li>
<li>Memcached</li>
<li>Gearman</li>
<li>Hadoop</li>
<li>Pig</li>
<li>Nagios</li>
<li>Ganglia</li>
<li>Git</li>
<li>Jira</li>
</ul>
<h2>Languages</h2>
<ul>
<li>C/C++ &#8211; ad server</li>
<li>Java &#8211; data transformation</li>
<li>Ruby &#8211; a lot of server-side Ruby. Probably more Ruby than C++, but moving more to Python.</li>
<li>Python &#8211; a lot of MapReduce is being moved into Python using <a href="http://allthingshadoop.com/2010/12/16/simple-hadoop-streaming-tutorial-using-joins-and-keys-with-python/">Python streaming</a>.</li>
<li>Scala</li>
<li>Bash</li>
</ul>
<h2>Architecture</h2>
<ul>
<li>The system was built over a couple years, mostly out of custom software, though they use Hadoop for the heavy lifting on the analytics side and MySQL as the database. Custom software was required to scale when they first started, but they are now considering more off-the-shelf products as these have evolved.</li>
<li>The system is real-time in the sense that as data trickles in it&#8217;s made available as fast as possible and as real as possible in reports.</li>
<li>Not cloud based. They run on physical hardware. They couldn&#8217;t do what they need to do in a multitenancy environment in the cloud. Machines with 16 cores and terabytes of disk space are almost not enough. They spend a lot of time building their software to take advantage of the hardware, disk IO, CPU, parallel processing, utilizing as much memory as they can.</li>
<li>Everything (almost) they do is asynchronous. You don&#8217;t need to be connected to a network to see an ad or for them to capture the analytics. Ads can be flighted (scheduled) weeks before a campaign kicks off and you can be on a subway and you&#8217;ll still be able to see the ad. Data collected during the ad is eventually passed to the server side.</li>
<li>Publishers are responsible for building the app and integrating the Medialets SDK. The SDK is specifically for analytics and advertisements. When they want to run analytics or run ads in ad slots they use Medialets&#8217; tools.</li>
<li>There are three basic subsystems: ad serving, event processing, and reporting.</li>
<li>Servers break down into a few tiers:
<ul>
<li>Forward facing tiers:
<ul>
<li>Ad Serving Tier &#8211; designed just to run ad servers.</li>
<li>Tracking Tier &#8211; handles data dumps and data loads.</li>
</ul>
</li>
<li>Asynchronous processing tiers:
<ul>
<li>Hadoop Cluster &#8211; runs on its own set of servers.</li>
<li>Java and Ruby Processes &#8211; take data that comes in and turn it into a form usable by Hadoop to get the different aggregations going.</li>
</ul>
</li>
</ul>
</li>
<li>Keep things as lean and mean as possible with only as much redundancy as needed.
<ul>
<li>Hardware is configured based on what the software is doing. Not all the machines are high end or low end. Different parts of the system have different hardware needs.</li>
<li>When responding to ad service requests there&#8217;s very little disk IO or computation. There&#8217;s a C++ service doing a lookup on static memory. So the ad serving machines have a lot of memory and lower end processors.</li>
<li>Where they have data intake they have a lot of sockets blocking and writing to disk, so they need very little memory.</li>
<li>The Hadoop MapReduce tier needs everything you can throw at it.</li>
</ul>
</li>
<li>The event handling flow goes something like:
<ul>
<li>Applications on mobile devices generate events. There are application events, ad events, and custom events. Custom events can be created by applications as key-value pairs and they&#8217;ll aggregate on that with no custom coding required.</li>
<li>Tracking servers receive events and write them into a log file of objects. The files represent a snapshot of time over which event data is collected, say seven minutes, for example.</li>
<li>Java servers read these files, unpack them, and run the objects through a series of thread pools so they are transformed and merged with all the necessary meta-data. Different processes pick up the different event types and process them.</li>
<li>The previous step creates a new file where the events are now complete after having been merged with meta-data. That new file is pushed into HDFS (the file system used by Hadoop).</li>
<li>MapReduce is run on these HDFS files to dedupe the data. After the deduplication occurs the data set is clean.</li>
<li>MapReduce jobs produce data that is then imported into databases that the analytics UI makes available to customers. Dozens of different types of aggregations are run. Standard metrics are provided, via Medialytics, showing what an app is doing from both an ad and app reporting perspective. They aggregate along many different dimensions, for example, by device and by platform. Some of this data is fed back to the ad serving systems.</li>
</ul>
</li>
<li>Data duplication is a key design point on mobile devices and asynchronous systems.
<ul>
<li>A phone OS doesn&#8217;t guarantee an application it will get time to generate events. In the OnClose hook the publisher will run some cleanup logic, Medialets has some cleanup logic, then the event is written to a server which takes a couple milliseconds to respond, and then the local app has to update the local database. Even though this is a quick trip you are in this 15 millisecond window where the OS doesn&#8217;t guarantee all the functionality will execute. The OS could kill the process or there&#8217;s a crash. This will lead to duplicate events on replay depending on where the failure occurs. The iPhone&#8217;s new background feature complicates the accounting. If an app is backgrounded for 30 seconds or less it&#8217;s still considered the same run. There are a lot of variations.</li>
<li>Data can still come in from a phone that&#8217;s been off for 3 months. They have to be able to know if they&#8217;ve seen that data before, a common occurrence if apps aren&#8217;t used all the time.</li>
<li>Hadoop is used to calculate metrics and aggregation, but it is also used for deduplication. When processing data they look at over a million events per second in order to dedupe the data. Every day they get 10 billion events. They have to look at every event and decide 1) whether they have seen the event before and 2) whether they are going to see the event again in the data set they are processing. Using MapReduce that data is broken up across a bunch of servers, which means you don&#8217;t really know until you&#8217;ve reduced everything that any of the data was duplicated.</li>
</ul>
</li>
<li>Advertising
<ul>
<li>Some types of event data are streamed back in real-time. For another class of data the event has to be opened and closed before an event is created. To know how many times someone used an app in a day, for example, there must be an open and a corresponding close event.</li>
<li>Users in the mobile world are truly unique. In the Internet world users, unless they have logged in, are almost impossible to quantify. A computer is used by many people and you can&#8217;t really identify who is using a device. A physical device like the phone is usually used by a single person, so aggregating on a device basis is essentially aggregating on a user basis.</li>
<li>They can gather stats like conversions along different dimensions. They can tell how long it takes someone to upgrade from one version of the OS to another. Who are the different types of people? Then they can overlay that to the different types of apps that they have.</li>
<li>Ads are either brand or direct response advertisements, there&#8217;s a mix. A movie ad, for example, when clicked on instead of going to a website could bring up a local application from the device. That allows you to buy the tickets right in the app. Having different interceptors and the capability of creating a rich user experience makes it possible to monetize interaction flows in new ways.</li>
<li>When an app is launched an ad should be displayed immediately. Async is used for pushing data to the servers, but ads can be served synchronously. A lot of apps have a sponsor logo you see when the app opens and that must be served immediately. The first onload call is a sync call to get the ads. Their ad serving system can serve ads in under 200 milliseconds with a 3% variance.</li>
</ul>
</li>
<li>70%-80% of their storage costs are saved because they store data in <a href="http://blog.mgm-tp.com/2010/04/hadoop-log-management-part2/">compressed sequence files</a>.
<ul>
<li>HDFS has a sequence file format. Compression can be either on a row or a block basis. Let&#8217;s say you have a sequence file with 10 blocks in it and a block is defined as data that will go into your mapper (for MapReduce). If compression is on a block level and there are 10 blocks then those 10 blocks can be pushed to mappers in parallel and HDFS will handle all the decompression and streaming automatically.</li>
<li>Data that comes out of the reducer can be stored in the sequence file, say it&#8217;s the result of the deduplication process, or it can be stored in an uncompressed format that&#8217;s ready to load in another database.</li>
<li>A lot of people leave the data uncompressed. To use Hive the data has to be uncompressed so you can interact with it. It&#8217;s a matter of how much do you want to spend and where do you want the inefficiencies of your system to come into play.</li>
<li>It&#8217;s a con to have a lot of data sitting on disk in an uncompressed format when it isn&#8217;t being interacted with. They selectively decide which format to use depending on whether the decompression phase is worth the overhead compared to how long they keep that data on disk.</li>
</ul>
</li>
<li>Job Execution System
<ul>
<li>Built in Ruby, before suitable off-the-shelf systems were available.</li>
<li>They built a job processing system to implement workflow processing. A workflow is a set of jobs that have different tasks and steps and operate on different events. App data must be processed in many different ways and the results have to go into a few different systems, some into tables, and some into other reporting systems. All that is automated and scripted.</li>
</ul>
</li>
<li>Aggregated data is stored in MySQL for viewing by publishers and advertisers. They are reaching a limit of how much they can shard. They are looking at MongoDB and GridFS (which is part of MongoDB).
<ul>
<li>GridFS is looking good to store, scale, and serve their media files, which is making them consider using MongoDB to store aggregate result sets.</li>
<li>They are also looking at Cassandra and HBase to store their aggregate result sets. They would consider using the same infrastructure also for their tracking and event capture servers, which are currently completely custom written.</li>
<li>Cassandra looks attractive because it works across multiple datacenters. They would use this feature to have multiple clusters in the same datacenter and have writes occur on one cluster and reads in another, so the different traffic loads won&#8217;t step on each other. They don&#8217;t want to mix different types of traffic, so they don&#8217;t want to do MapReduce jobs, writing from HBase, and reading from HBase all on the same machines.</li>
<li>HBase is an attractive option because they already write so much data to HDFS that having those files available in HBase would be exciting. They&#8217;ve had reliability concerns over fsync, making sure data is written to disk, but those concerns have been addressed in recent releases. HBase doesn&#8217;t allow partitioning the data by different uses, which is what is attractive about Cassandra as Cassandra supports having different kinds of racks within the cluster.</li>
<li>Since they are already using HDFS moving all the data into Cassandra after it is processed isn&#8217;t attractive, that would double their hard drive requirements.</li>
<li>They like the idea of the <a href="http://highscalability.com/blog/2010/11/1/hot-trend-move-behavior-to-data-for-a-new-interactive-applic.html">coprocessors</a> so they don&#8217;t have to move the data across the network. Each job is 2-3 terabytes so to truly parallelize that without moving data is very attractive.</li>
<li><a href="http://www.quora.com/Which-NoSQL-stores-have-good-support-for-LRU-or-TTL-semantics-in-storage">TTL deletes</a> in Cassandra are very attractive. Cassandra can easily handle their write load, so it can be used to store incoming events. Then all the work processes can take the mobile data out of Cassandra, merge it with meta-data from other databases to make fully joined objects, then that could be written to HBase, then they could do MapReduce aggregations and write the results to MongoDB.</li>
<li>An alternative dedupe design would be to write it all to HBase and just pick the last one as the winner. Once these systems are in place they&#8217;ll rethink some of their existing processes to see how they could take advantage.</li>
<li>A lot of prototyping was undertaken to figure out if they should keep with their existing software, move to another database, or stay with MySQL. They may end up with MongoDB, Cassandra, and HBase, they just want to find the right mix of capabilities for the new products they are building, and figure out how they can continue to scale without soaking up a lot of developer time.</li>
</ul>
</li>
<li>AdServers are written in C++
<ul>
<li>This layer provides a standard ad scheduling facility so ads can be mapped to slots, rotated, targeted, etc. Targeting can be based on platform, resolution, geography and other dimensions.</li>
<li>There&#8217;s an object cache for database data that is used to make ad delivery decisions.</li>
<li>99% of the time the cache is sufficient, 1% of the time they have to hit the database.</li>
<li>Reads are in a few microseconds, but increase to 200 microseconds when they have to hit the database.</li>
<li>Also responsible for pacing ads so a campaign can be delivered across the lifetime of an app. If an app is run a million times a day, for example, and the ad campaign is for a million impressions, they don&#8217;t want to use it up in one day. Let&#8217;s say the advertiser wants the campaign to run for a month. The ad server looks at the analytic data and the rate requests are coming in to calculate what pace ads should be delivered.</li>
<li>A lot of decisions are precomputed. A human will slot target, saying where an ad should be displayed.</li>
<li>Decisions like geographical distribution are calculated on the fly. If you have a set of ad impressions to give out you&#8217;ll want some to go to Canada and some to go to the US, for example.</li>
</ul>
</li>
<li>Java Servers
<ul>
<li>Join meta-data with the data sets that come in. Meta-data is pulled from cache at 95% hit rate. When they have a new advertisement that went live, for example, they&#8217;ll hit the database.</li>
<li>Going to the database they want to be as optimistic as possible and interrupt as few threads as possible, so they use atomic variables and CAS (compare and set) when doing very heavy read operations and very few writes. This switch increased performance 15%-20% because they are no longer blocking on writes.</li>
<li>For the amount of reads they were doing they benchmarked it and found a Mutex took too long. Semaphores ended up blocking on the writer. Say there were 10 threads and 9 could read so no threads would block, but the 10th thread had to write which would block all the threads. This increased latency and didn&#8217;t perform well compared to looping and doing the compare and set. It&#8217;s possible that, because they are continually processing gigabytes of data, something inside the JVM was being blocked.</li>
<li>Used <a href="http://javathink.blogspot.com/2008/09/what-is-memoizer-and-why-should-you.html">concurrent memoizer pattern</a> to create thread pools that handle cache requests. The cache is a pool that will load data when required. The load uses CAS to block the actual reads that are occurring.</li>
<li><a href="http://www.eecs.harvard.edu/~mdw/proj/seda/">SEDA</a> is used to process data through all its different transformations. Each thread pool performs a state transformation on a chunk of data and then the data is forwarded onto another thread pool for the next transformation. For example, the first stage is to read the data off disk and serialize it into an object array. These operations are not latency sensitive.</li>
</ul>
</li>
<li>Using Ruby
<ul>
<li>When using Ruby one must fork to really multiprocess functions effectively.</li>
<li><a href="http://en.wikipedia.org/wiki/Rinda_(Ruby_programming_language)">Rinda</a> is used to create concurrencies across forked processes. Sometimes a database is used for coordination.</li>
<li>This hides any memory leak problems or green thread problems common to interpreters.</li>
</ul>
</li>
<li>Monitoring
<ul>
<li>Mix of internal tools and Nagios.</li>
<li>Do a lot of trending of their own logs with Ganglia across all their different tiers.</li>
<li>They take a very proactive monitoring approach and spend a lot of R&amp;D effort so they can know they have problems before they occur.</li>
<li>Trending feeds into their monitoring. If one of their ad servers stops responding within 10 milliseconds, for more than 1 or 2 requests, within a second, they need to know about that.</li>
<li>If request latencies go up from 200 milliseconds to 800 milliseconds on average they want to know.</li>
<li>They log a lot so debugging of problems occurs through the logs.</li>
</ul>
</li>
</ul>
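<p>The MapReduce deduplication step described in the event handling flow can be sketched as below. This is not Medialets&#8217; code: the <code>event_id</code> and <code>received_at</code> fields, the function names, and the rule of keeping the earliest copy are all assumptions made for illustration. The point it shows is the one made above, that a duplicate is only recognizable once every copy with the same key has been brought together in the reduce step.</p>

```python
from collections import defaultdict


def map_event(event):
    """Map: key each event by a (hypothetical) unique event id."""
    return event["event_id"], event


def reduce_events(event_id, events):
    """Reduce: all copies of one event arrive together; keep the earliest."""
    return min(events, key=lambda e: e["received_at"])


def dedupe(batches):
    """Single-process stand-in for the shuffle: group by key, then reduce."""
    groups = defaultdict(list)
    for batch in batches:
        for event in batch:
            key, value = map_event(event)
            groups[key].append(value)
    return [reduce_events(k, v) for k, v in sorted(groups.items())]


# Two hypothetical log files; the second replays an event already sent.
batches = [
    [{"event_id": "a1", "type": "shake", "received_at": 100},
     {"event_id": "b2", "type": "rotate", "received_at": 101}],
    [{"event_id": "a1", "type": "shake", "received_at": 250}],
]
clean = dedupe(batches)
print(clean)
```

<p>Swapping <code>min</code> for <code>max</code> gives the &#8220;pick the last one as the winner&#8221; variant mentioned as an alternative design.</p>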
<h2>Lessons Learned</h2>
<ul>
<li><strong>Turn Data into Products</strong>. Development knows the data they have. That knowledge can be used to help the product team create new products to sell to customers. There&#8217;s always a big gap between R&amp;D and the business. Help the business be in tune with what R&amp;D is doing. Have them understand the power of Hadoop, the data they crunch and how fast they can crunch it. A good example is a new Conversion Attribution product. If a user sees a static ad, clicks on it and downloads the app, it&#8217;s likely that static ad wasn&#8217;t the sole reason for the conversion. If, for example, the user experienced a rich media ad the day before (or two weeks earlier) then that ad would be apportioned some credit for the conversion, based on publisher configurable criteria. This kind of capability is only possible with a robust data processing infrastructure. Without R&amp;D making it known that this sort of feature is even possible, it&#8217;s not likely these new kinds of high value-add products could be developed and sold to customers.</li>
<li><strong>Explore New Tools</strong>. It&#8217;s a complex world of new tools. All these new technologies make it challenging to know where to put functionality. A feature can be done in so many different ways and would be done differently depending on the tool (HBase, Cassandra, MongoDB). Should the dedupe still be done in MapReduce or should it be done using coprocessors? Is it worth doubling disk by supporting two different databases? Is partitioning data by usage pattern really a win or can all the traffic patterns work on the same system? Prototype your options and think about how your architecture could change with each new tool, working alone or working together.</li>
<li><strong>Monitor and Capacity Plan Proactively</strong>. Turn monitoring data around into planning for infrastructure. If you don&#8217;t do that you&#8217;ll just keep firefighting issues. Build your proactive alerts and trending data so even before it becomes a warning you can see it coming. Sometimes you just need another server. More data simply means more servers. It&#8217;s a cost of doing business. Knowing what server to put where in order to keep all the different dataflows going appropriately, that&#8217;s the trick. It&#8217;s important to compare the trends of CPU and load and to really look at the infrastructure as a whole and create a plan based on that.</li>
<li><strong>Look at Data from a Product Operations Perspective</strong>. Look at new applications and see how they are similar to what has been implemented before as a way to figure out how you need to scale. Just because a new app comes on board doesn&#8217;t mean new ad servers, new data nodes, and new tracking servers need to be added. It depends. A lot of different ads make a small amount of ad requests but send a ridiculous amount of data. Trend logs to look at spikes in the data and see how latency is related to the spikes. This tells you where and how different parts of the system need to scale.</li>
</ul>
<h2>Related Articles</h2>
<ul>
<li>Joe Stein&#8217;s <a href="http://www.twitter.com/allthingshadoop">AllThingsHadoop Twitter Feed</a></li>
<li><a href="http://www.medialets.com/tackling-big-data-problems-at-scale/">Tackling Big Data Problems at Scale</a> by Joe Stein</li>
<li>Published Q4 2010 mobile figures <a href="http://www.medialets.com/medialets-data-spotlight-mobile-rich-media-momentum-q4-2011/" target="_blank">http://www.medialets.com/medialets-data-spotlight-mobile-rich-media-momentum-q4-2011/</a></li>
<li><a href="http://allthingshadoop.com/2010/05/17/hadoop-bigdata-cassandra-a-talk-with-jonathan-ellis/">Hadoop, BigData and Cassandra with Jonathan Ellis</a> &#8211; HBase is to OLAP as Cassandra is to OLTP</li>
<li><a href="http://highscalability.com/blog/2010/2/19/twitters-plan-to-analyze-100-billion-tweets.html">Twitter’s Plan To Analyze 100 Billion Tweets</a></li>
</ul>
<p><a href="http://highscalability.com/blog/2011/3/8/medialets-architecture-defeating-the-daunting-mobile-device.html">http://highscalability.com/blog/2011/3/8/medialets-architecture-defeating-the-daunting-mobile-device.html</a></p>
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2011/977.html' rel='bookmark' title='TripAdvisor Architecture &#8211; 40M Visitors, 200M Dynamic Page Views, 30TB Data'>TripAdvisor Architecture &#8211; 40M Visitors, 200M Dynamic Page Views, 30TB Data</a></li>
<li><a href='http://www.dbasoul.com/2008/35.html' rel='bookmark' title='Oracle Metalink Reference &#8211; Data guard'>Oracle Metalink Reference &#8211; Data guard</a></li>
<li><a href='http://www.dbasoul.com/2010/731.html' rel='bookmark' title='Data Recovery Advisor – oracle 11g new feature'>Data Recovery Advisor – oracle 11g new feature</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1021.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mollom Architecture &#8211; Killing Over 373 Million Spams At 100 Requests Per Second</title>
		<link>http://www.dbasoul.com/2011/1020.html</link>
		<comments>http://www.dbasoul.com/2011/1020.html#comments</comments>
		<pubDate>Tue, 28 Jun 2011 07:13:14 +0000</pubDate>
		<dc:creator>Eric</dc:creator>
				<category><![CDATA[Architecture]]></category>

		<guid isPermaLink="false">http://www.dbasoul.com/?p=1020</guid>
		<description><![CDATA[Mollom is one of those cool SaaS companies every developer dreams of creating when they wrack their brains looking for a viable software-as-a-service startup. Mollom profitably runs a useful service—spam filtering—with a small group of geographically distributed developers. Mollom helps protect nearly 40,000 websites from spam, including one of mine, which is where I first learned...  <a href="http://www.dbasoul.com/2011/1020.html" class="more-link" title="Read Mollom Architecture &#8211; Killing Over 373 Million Spams At 100 Requests Per Second">Read more &#187;</a>
Related posts:<ol>
<li><a href='http://www.dbasoul.com/2011/999.html' rel='bookmark' title='Friends For Sale Architecture &#8211; A 300 Million Page View/Month Facebook RoR App'>Friends For Sale Architecture &#8211; A 300 Million Page View/Month Facebook RoR App</a></li>
<li><a href='http://www.dbasoul.com/2011/1018.html' rel='bookmark' title='Playfish&#8217;s Social Gaming Architecture &#8211; 50 Million Monthly Users And Growing'>Playfish&#8217;s Social Gaming Architecture &#8211; 50 Million Monthly Users And Growing</a></li>
<li><a href='http://www.dbasoul.com/2011/995.html' rel='bookmark' title='Mailinator Architecture'>Mailinator Architecture</a></li>
</ol>]]></description>
			<content:encoded><![CDATA[<p class="lead"><a href="http://mollom.com/">Mollom</a> is one of those cool SaaS companies every developer dreams of creating when they wrack their brains looking for a viable software-as-a-service startup. Mollom profitably runs a useful service—<a href="http://www.youtube.com/watch?v=anwy2MPT5RE">spam filtering</a>—with a small group of geographically distributed developers. Mollom helps protect nearly 40,000 websites from spam, including <a href="http://biztaxtalk.com/">one of mine</a>, which is where I first learned about Mollom. In a desperate attempt to stop spam on a Drupal site, where every other form of CAPTCHA had failed miserably, I installed Mollom in about 10 minutes and it immediately started working. That&#8217;s the out of the box experience I was looking for.</p>
<p>From the time Mollom opened its digital inspection system they&#8217;ve rejected over 373 million spams and in the process they&#8217;ve learned that a stunning 90% of all messages are spam. This spam torrent is handled by only two geographically distributed machines that handle 100 requests/second, each running a Java application server and Cassandra. So few resources are necessary because they&#8217;ve created a very efficient machine learning system. Isn&#8217;t that cool? So, how do they do it?</p>
<p>To find out I interviewed Benjamin Schrauwen, cofounder of Mollom, and Johan Vos, Glassfish and Java enterprise expert. Proving software knows no national boundaries, Mollom HQ is located in <a href="http://en.wikipedia.org/wiki/Belgium">Belgium</a> (other good things from Belgium: <a href="http://en.wikipedia.org/wiki/Hercule_Poirot">Hercule Poirot</a>, <a href="http://www.google.com/images?q=belgian+chocolate">chocolate</a>, <a href="http://www.google.com/images?q=belgium+waffles">waffles</a>).</p>
<h2>Statistics</h2>
<ul>
<li>Serving 40,000 active websites, many of which are very large customers like Sony Music, Warner Brothers, Fox News, and The Economist. A lot of big brands, with big websites, and a lot of comments.</li>
<li>Find 1/2 million spam messages each day.</li>
<li>Handle 100 API calls/second.</li>
<li>A spam check is low latency, taking between 30-50msecs. The slowest connection would be 500msec. The 95th percentile of latency is 250msecs. It&#8217;s really optimized for speed.</li>
<li>Spam classification efficiency is at 99.95%. This means that only 5 in 10,000 spam messages were not caught by Mollom.</li>
<li><a href="http://mollom.com/blog/netlog-using-mollom">Netlog</a>, which is a social networking site in Europe, has their own Mollom setup in their own datacenter. Netlog handles about 4 million messages a day on custom <a href="http://en.wikipedia.org/wiki/Learning_classifier_system">classifiers</a> that are trained on their data.</li>
</ul>
<h2>Platform</h2>
<ul>
<li>Two production servers run in two different datacenters for failover.
<ul>
<li>One server is on the East coast and one is on the West coast.</li>
<li>Each server is an Intel Xeon Quad core, 2.8GHz, 16GB RAM, 4 disks of 300 GB, RAID 10.</li>
</ul>
</li>
<li><a href="http://www.softlayer.com/">SoftLayer</a> &#8211; the machines are hosted by SoftLayer.</li>
<li><a href="http://cassandra.apache.org/">Cassandra</a> &#8211; a NoSQL database selected for its write performance and ability to operate across multiple datacenters.</li>
<li><a href="http://en.wikipedia.org/wiki/GlassFish">Glassfish</a> &#8211; open source application server for the Java EE platform. They picked Glassfish for its enterprise-ready features like replication and failover.</li>
<li><a href="http://hudson-ci.org/">Hudson</a> &#8211; provides for continuous testing and deployment of the backend across all their servers.</li>
<li>Java &#8211; From the start Mollom was written in Java.</li>
<li><a href="http://munin-monitoring.org/">Munin</a> &#8211; used to measure and plot metrics concerning server health.</li>
<li><a href="http://www.mysql.com/">MySQL</a> &#8211; JPA (the Java Persistence API) is used for regular data sets and Cassandra is used for large data sets.</li>
<li><a href="http://www.pingdom.com/">Pingdom</a> &#8211; used for uptime monitoring.</li>
<li><a href="http://www.zendesk.com/">Zendesk</a> &#8211; used for support.</li>
<li><a href="http://drupal.org/">Drupal</a> &#8211; used for the main website with a custom E-commerce module.</li>
<li><a href="http://unfuddle.com/">Unfuddle</a> &#8211; Subversion hosting used for source code control by their distributed development team.</li>
</ul>
<h2>What Is Mollom?</h2>
<p>Mollom is a web service for filtering out various types of spam from user generated content: comments, forum posts, blog posts, polls, contact forms, registration forms, and password request forms. Spam determination is not only based on the posted content, but also on the past activity and reputation of the poster. Mollom&#8217;s <a href="http://en.wikipedia.org/wiki/Machine_learning">machine learning</a> algorithms act as your 24&#215;7 digital moderator, so you don&#8217;t have to.</p>
<h3>How Is It Used?</h3>
<p>Applications like Drupal, for example, integrate Mollom using a Module that installs itself into content editing integration points so content can be checked for spam before being written to the database. The process looks like:</p>
<ul>
<li>When a user submits a comment to a website, an API call is made to the backend servers.</li>
<li>The content is analyzed. If it is spam the site will be told to block it, or if the backend is unsure it will advise the site to show a <a href="http://en.wikipedia.org/wiki/CAPTCHA">CAPTCHA</a>, which Mollom also serves.</li>
<li>When the CAPTCHA is filled in correctly the content will be accepted. In most cases a human will not see a CAPTCHA; the content will be accepted directly as <em>ham</em>, ham being good content and <em>spam</em> being bad content.</li>
<li>A CAPTCHA is only displayed when the machine learning algorithms are not 100 percent sure, so for the most part humans are not inconvenienced.</li>
</ul>
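<p>The decision flow above can be sketched in a few lines of Python. This is only an illustration: the score range, the two thresholds, and the function names <code>decide</code> and <code>handle_submission</code> are assumptions made for the example, not Mollom&#8217;s actual API or values.</p>
<pre><code class="language-python">from bisect import bisect_right

# Hypothetical cut-offs on a spam score between 0.0 (ham) and 1.0 (spam).
THRESHOLDS = [0.2, 0.8]
ACTIONS = ["accept", "captcha", "block"]


def decide(spam_score):
    """Pick the action for a score: low is ham, high is spam, middle is unsure."""
    return ACTIONS[bisect_right(THRESHOLDS, spam_score)]


def handle_submission(spam_score, captcha_solved=False):
    """Return True when the content may be written to the database."""
    action = decide(spam_score)
    if action == "captcha":
        # Unsure content is accepted only after a correct CAPTCHA answer.
        return captcha_solved
    return action == "accept"


print(decide(0.05), decide(0.5), decide(0.95))  # accept captcha block
</code></pre>
<p>Keeping the unsure band narrow is what lets most human posters through without ever seeing a CAPTCHA.</p>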
<h3>Dashboard</h3>
<p>Mollom includes a pretty <a href="http://mollom.com/scorecard">nifty dashboard</a> for each account that shows you how much ham has been accepted and how much spam has been rejected. The amount of spam that one sees in the graph is really depressing.</p>
<h3>Operations Process</h3>
<p><strong>Installation. </strong>Installation is quite easy for Drupal. Install it like any other module. Create an account on the Mollom website. Get a pair of security keys, configure these keys into the module, and select which parts of the system you want to protect with Mollom. That&#8217;s about it.</p>
<p><strong>Daily</strong>. I check regularly to see if spam has gotten through. It&#8217;s not 100% so some spam does get through, but very little. If spam does get through there&#8217;s a way to tell Mollom that this post was really spam and should be deleted. This is what you would have to do anyway, but in the process you are helping train Mollom&#8217;s machine learning algorithms about what is spam.</p>
<p><strong>Allows for anonymous user interaction</strong>. With a good spam checker it&#8217;s possible to have a site where people can interact anonymously, which is what a lot of people using certain types of sites really prefer. Once you require registration engagement goes way down and registration doesn&#8217;t stop spammers anyway.</p>
<h3>Not Everything Is Rosy</h3>
<p>Dealing with false positives is Mollom&#8217;s biggest downside. Spam detection is a difficult balancing act between rejecting ham and accepting spam. Mollom&#8217;s machine learning algorithms seem to work quite well, but sometimes good posts are rejected and you get the dreaded: <em>Your submission has triggered the spam filter and will not be accepted</em>. Currently there is no recourse. Few things piss off a user more than having their glorious comment rejected as spam when it&#8217;s obvious, to a human, that it&#8217;s not. A user will only try a few times to get around the problem and then they will simply give up and walk away.</p>
<p>The problem is there is no way to fix the problem. To protect the machine learning algorithms from being gamed, Mollom does not allow you to present an example of a wrongly rejected chunk of content that should be accepted, though they are working on adding this in the future.</p>
<p>It&#8217;s a tough decision. Static CAPTCHA systems, that is systems that only require a user pass a test to submit content, simply do not work once a site has been targeted for serious attack. User registration doesn&#8217;t work. Moderating every post requires a very high burden, especially for &#8220;hobby&#8221; sites, given a site can have thousands of spams a day. And spam completely kills a site, so the risk of angering some users has to be balanced against having no users in the end because of a bombed-out site.</p>
<h2>Business Model</h2>
<ul>
<li>What makes Mollom a no-brainer is it is free to try. You only pay once your site starts accepting over 100 ham messages a day. For a small site this may never happen.</li>
<li>Once you pass the free threshold there is Mollom Plus (1 EUR a day) and Mollom Premium (3600 EUR/year/site) pricing tiers, which seemed pretty reasonable.</li>
<li>The free websites are not a resource drain as you might expect; they are actually a source of critical training data. All websites using Mollom are constantly feeding back data to the backend classifiers. The more people use Mollom, the better the training becomes through all the feedback it gets from users. Without free sites Mollom wouldn&#8217;t be as accurate as it is.</li>
</ul>
<h2>Architecture</h2>
<ul>
<li>Mollom is very engineering driven. A major emphasis has been put on making Mollom as efficient as possible both in code and server resource usage.</li>
<li>Physically each server can handle all requests, but they have complete failover. Work is distributed between the machines. If one goes down then work moves to the other machine.
<ul>
<li>They used to have 3 servers, but since they improved the performance a lot they can use the third server as a staging server.</li>
<li>Each server can handle a full 100 connections per second and each connection runs the entire pipeline: full text analysis, calculating the reputations of the authors, and serving CAPTCHAS.</li>
</ul>
</li>
<li>Really optimized for low latency. Since the spam detection is part of the process of content being submitted to a site, if it took a long time it would be really annoying to users.</li>
<li>Mollom has gone through several phases of evolution:
<ol>
<li>Initially a small team of two people worked part time on the algorithms, the classifiers, the real business problems they were trying to solve. To build out the infrastructure on the backend they used their own implementations of thread pooling, connection pooling, and resource management. They found they were spending too much time on supporting all this stuff and making it scale. They then switched to Glassfish, a Java application server, so they would have to worry a lot less about memory management, REST handling, XML parsing, and database connection pooling.</li>
<li>The main problem in the past was disk bandwidth. They need to track reputations for all IP addresses and for all URLs on the Internet, so this was a massive datastore with a lot of random access.</li>
<li>In the early days they used a cheap virtualized machine and everything was in MySQL, which didn&#8217;t scale.</li>
<li>They then moved to solid state disk and stored everything in files. Solid state solved the write problem, but there were issues:
<ol>
<li>It&#8217;s really expensive.</li>
<li>It&#8217;s very sensitive to the type of file system installed.</li>
<li>Writes are fast, but iterating over the data, which they did quite often, to clean up the data or to train a new classifier over millions of millions of small objects, was still very slow.</li>
</ol>
</li>
<li>They then moved away from solid state and moved to Cassandra.</li>
</ol>
</li>
<li>Cassandra is now used as their database for write heavy loads and as a caching layer:
<ul>
<li>Runs on a <a href="http://www.thegeekstuff.com/2010/08/raid-levels-tutorial/">RAID 10</a> disk configuration (striped and mirrored), which is very good for heavy reads and writes.</li>
<li>Cassandra is optimized for writes and Mollom has a lot more writes than reads.</li>
<li>Designed to be distributed inside a datacenter and across datacenters.</li>
<li>A downside is there is no standard NoSQL interface which makes it difficult to write applications.</li>
<li>Cassandra&#8217;s row caching saves them from having to add another caching layer in their system, which removes a lot of application code.</li>
<li>Cassandra has an aging feature which automatically deletes data after a period of time. Europe has strict privacy laws which require certain data be removed after a period of time. This feature is a huge win: purging aged data is an extremely expensive operation, and having Cassandra handle it removes a lot of application code.</li>
</ul>
</li>
<li>The path a blog comment takes through the system:
<ul>
<li>A request can come from any client. Clients load balance requests across servers. This part is explained later. The typical client is a Drupal system. Requests can be XML-RPC or REST.</li>
<li>Requests are handled by a Glassfish application server and follow a typical application server workflow: requests are handled by servlets and are delegated to session beans.</li>
<li>Paying customers are serviced first, free customers may experience longer delays.</li>
<li>The request is parsed and analyzed. More on this later. A spam score is determined and returned to the user. So there are different functional parts of Mollom: spam checking and CAPTCHA handling. CAPTCHA includes generating, serving, and processing the responses. Different session beans are responsible for different parts of the Mollom functionality.</li>
<li>Classifiers are fully in RAM. A small piece of content is blown up and broken into thousands and thousands of tiny tokens that could identify spam. These classifiers have in the area of a few million tokens in RAM. The classifiers need to run really quickly so they have to be in RAM.</li>
<li>What is kept in Cassandra are reputation scores, frequencies, URLs, and IP addresses. Cassandra&#8217;s new row caching feature now acts as their caching layer. Previously they implemented an internal cache, but that was removed.</li>
<li>Both machines in both datacenters run Cassandra and the Glassfish application server. Cassandra is constantly replicating data between datacenters.</li>
<li>The in-memory datastructures are not replicated directly. They write through to Cassandra which then replicates. The cache on the other side has a timeout so it will go to Cassandra and pick up the new data. It&#8217;s eventually consistent. There&#8217;s a small window of inconsistency, but the models aren&#8217;t negatively impacted over such a short period of time.</li>
<li>Consistency is managed on a case-by-case basis. For reputation and IP addresses eventual consistency is fine. Session data, including CAPTCHA sessions, is kept strictly consistent so it will follow correctly when a machine fails over.</li>
</ul>
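<p>The write-through and timeout behaviour described above can be illustrated with a small Python sketch. Everything here is a stand-in: <code>SharedStore</code> is a plain dictionary playing the role of the replicated Cassandra tier, the cache timeout is simulated by an explicit <code>expire()</code> call, and the class names and key format are invented for the example.</p>
<pre><code class="language-python">class SharedStore:
    """Stand-in for the replicated data tier (a plain dict here)."""

    def __init__(self):
        self.rows = {}


class WriteThroughCache:
    """In-memory cache for one server; every write goes through to the store."""

    def __init__(self, store):
        self.store = store
        self.local = {}

    def put(self, key, value):
        self.local[key] = value
        self.store.rows[key] = value  # write-through

    def get(self, key):
        if key not in self.local:
            # Cache miss: pick up whatever the store holds now.
            self.local[key] = self.store.rows.get(key)
        return self.local[key]

    def expire(self):
        """Simulate the cache timeout by dropping all local entries."""
        self.local.clear()


store = SharedStore()
east, west = WriteThroughCache(store), WriteThroughCache(store)

east.put("ip:203.0.113.7", 0.1)
west.get("ip:203.0.113.7")          # west caches 0.1
east.put("ip:203.0.113.7", 0.9)     # east updates the reputation
stale = west.get("ip:203.0.113.7")  # west still sees the old value
west.expire()                       # the timeout passes on the west side
fresh = west.get("ip:203.0.113.7")  # west now reads the new value
print(stale, fresh)                 # 0.1 0.9
</code></pre>
<p>The stale read between the update and the timeout is the small window of inconsistency; for reputation data that window is acceptable, while session data would need a stricter path.</p>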
</li>
<li>Client Side Load Balancing
<ul>
<li>Mollom uses <a href="http://mollom.com/api/client-side-load-balancing">client side load balancing</a> that distributes load based on latency, etc. As a startup they didn&#8217;t have the money to buy a large load balancer. They also had a goal to be able to do global load balancing over multiple datacenters, which would have required an expensive and complicated setup.</li>
<li>Management of the client list is through an API. Each client has a list of available servers it can use.</li>
<li>Each client can get a different list of servers to use. Paying customers can be provided a list of closer servers to reduce latency.</li>
<li>When a server fails the clients will try the next server in the list.</li>
<li>Client side load balancing helped with migration from the old system to the new Glassfish based system. New users were given the address of the migrated machines and old users still worked on the old machines and could be migrated in orderly fashion by updating their server lists. This allowed them to test functionality first and then scaling and performance. They looked at response times, how many connections were in the connection queue, and how long connections would stay in the queue. They could test what would happen if they increased threads in the thread pool, changed the number of JDBC connections, and other configurations. Once everyone migrated the old servers were shut down. Very little downtime was experienced during the transition, which is key for a highly available system. While the system is down spams are getting through.</li>
<li>A downside of the client side approach is that if third party clients are written poorly then there will be problems. The client could get a server list, for example, and iterate over it in reverse order, which is wrong. They now work closely with client developers and provide quality reference code examples so developers can learn best practices.</li>
</ul>
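<p>A minimal sketch of the failover half of this scheme, in Python. It covers only &#8220;try the next server in the list&#8221; and not latency-based distribution. The function <code>call_with_failover</code>, the host names, and the fake transport are all hypothetical and are not taken from Mollom&#8217;s client libraries.</p>
<pre><code class="language-python">def call_with_failover(servers, send):
    """Try each server in list order and return the first successful reply.

    servers: addresses in the order handed out by the backend.
    send: callable taking an address; raises ConnectionError on failure.
    """
    errors = []
    for address in servers:
        try:
            return send(address)
        except ConnectionError as exc:
            errors.append((address, exc))
    raise ConnectionError("all servers failed: %r" % errors)


# Hypothetical server list and transport, for demonstration only.
SERVERS = ["east.example.com", "west.example.com"]
DOWN = {"east.example.com"}


def fake_send(address):
    if address in DOWN:
        raise ConnectionError(address + " is unreachable")
    return "reply from " + address


print(call_with_failover(SERVERS, fake_send))  # reply from west.example.com
</code></pre>
<p>Walking the list in the order it was issued matters: a client that iterates in reverse would send its traffic to the server intended as the fallback.</p>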
</li>
<li>Machine Learning
<ul>
<li>Mollom is a set of learning systems. A stand-alone CAPTCHA solution, which neither considers user behavior nor point of origin, can never achieve this level of informed protection, and generally requires users to solve a CAPTCHA on every post. Using Mollom&#8217;s text analysis, users must only solve CAPTCHAs when Mollom is unsure about a post.</li>
<li>The average message length is about 500 characters and this is blown up into 3,000 features. The spamminess of a post is determined by looking at the reputation of the IP address or OpenID, the user ID, sentiment, language, profanity, specific words and combinations of words, how well the text has been written, etc. All these are based on classifiers. Some of the classifiers are statistical in nature and can learn automatically. Some classifiers are rule based to ensure they can never go wrong. It&#8217;s the combination of all these tests in the end that determines the spam score.</li>
<li>They learn from this process and the classifiers and internal metrics are updated in real-time.</li>
</ul>
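<p>To make the idea of combining classifiers concrete, here is a toy Python sketch. It is not Mollom&#8217;s algorithm: the token weights, the blocked term list, the 0.7/0.3 blend, and all function names are assumptions chosen only to show a statistical score, a rule-based override, and a reputation signal feeding one final number.</p>
<pre><code class="language-python">import re

# Hypothetical learned token weights; a real system would hold millions.
TOKEN_WEIGHTS = {"cheap": 0.6, "pills": 0.8, "cheap pills": 0.9, "thanks": 0.05}


def tokenize(text):
    """Break content into word and word-pair tokens (a toy feature set)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    pairs = [a + " " + b for a, b in zip(words, words[1:])]
    return words + pairs


def statistical_score(tokens):
    """Average the weights of known tokens; 0.5 means no evidence either way."""
    known = [TOKEN_WEIGHTS[t] for t in tokens if t in TOKEN_WEIGHTS]
    return sum(known) / len(known) if known else 0.5


def rule_score(text, blocked_terms=("viagra",)):
    """Rule-based check: 1.0 when a blocked term appears, otherwise None."""
    lowered = text.lower()
    return 1.0 if any(term in lowered for term in blocked_terms) else None


def spam_score(text, ip_reputation):
    """Blend content and reputation scores; a firing rule overrides both."""
    rule = rule_score(text)
    if rule is not None:
        return rule
    content = statistical_score(tokenize(text))
    return 0.7 * content + 0.3 * ip_reputation


print(spam_score("Cheap pills here", ip_reputation=0.9))
print(spam_score("Thanks for the post", ip_reputation=0.1))
</code></pre>
<p>The rule path returns a fixed verdict regardless of what the statistical part has learned, which mirrors the point that some checks are rule based so they can never drift.</p>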
</li>
<li>Glassfish handles work scheduling and was designed to handle multicore workloads.
<ul>
<li>The key is to create a design that works in parallel as much as possible and has as small a lock window as possible.</li>
<li>The number of concurrent HTTP connections is tuned so they have an appropriately sized pool of available connections.</li>
<li>They use 16 threads per server.</li>
<li>Most of the calls are handled by stateless session beans which work well for concurrency management.</li>
<li>They keep a number of session beans in a pool, but let Glassfish decide how many should be in a pool. At peak load there will be more session beans in the pool so requests can be handled efficiently. 32 session beans can be running in parallel at any given moment.</li>
<li>All the classifiers are actually session beans that are being reused by different threads.</li>
<li>Each session bean has its own client connection to Cassandra so the beans will not block each other.</li>
<li>When a user doesn&#8217;t respond to a CAPTCHA that session is cleaned up and Mollom learns that the post was probably spam.</li>
<li>There is one instance of each classifier per server.</li>
<li>There is a very short period of lock contention on session cleanup where classifiers are being updated.</li>
<li>Updated classifiers are written back to Cassandra every 1/2 hour.</li>
</ul>
</li>
<li>Application Integration
<ul>
<li>Mollom uses an open API which can be integrated into any system.</li>
<li>Libraries: Java, PHP, Ruby, and more.</li>
<li>Integrated solutions: Drupal, Joomla, WordPress, and other content management systems.</li>
<li>Third parties generate new bindings based on example code produced by Mollom.</li>
</ul>
</li>
<li>To monitor the health of a server they continually monitor, using Munin:
<ul>
<li>What is the size of the heap after garbage collection?</li>
<li>What is the number of available connections?</li>
<li>What is the number of threads available in the thread pool? Make sure there&#8217;s no one thread waiting for a long time for a lock.</li>
</ul>
</li>
<li>When you look at their architecture, Mollom is trying to build a system that can transparently work out of multiple datacenters and is set up to scale out once they&#8217;ve outgrown a single server system:
<ul>
<li>Client side load balancing is used to select and failover servers.</li>
<li>Glassfish clustering will be used to failover at the application tier and to make it easy to add and remove machines.</li>
<li>Cassandra will be used to manage the data tier across datacenters.</li>
</ul>
</li>
<li>Netlog&#8217;s installation of Mollom has some interesting characteristics. It processes more messages than the main Mollom.com servers, but the distribution of spam is completely different because the people communicating are part of a social network. The distribution on Netlog is 90% ham and 10% spam, where it&#8217;s exactly the reverse out in the cruel world of blogs. The interesting implication is that it takes fewer resources to process ham so they can actually perform more work on the same servers.</li>
<li>They initially tried virtualized servers, and thought of using Amazon, but they found IO was the main bottleneck using shared virtualized servers. IO latency and bandwidth were real problems so they decided to go scale-up and use bigger machines and bigger disks.
<ul>
<li>Surprisingly they are not CPU bound. Only two of the 8 cores are computing. The others are just doing IO.</li>
<li>Mollom&#8217;s traffic is fairly constant, so having dedicated servers is more cost efficient. They see Amazon more as a way to handle peak loads.</li>
</ul>
</li>
<li>Development Process
<ul>
<li>They are a distributed team. Three people are in Belgium, and one each in Texas, Boston, and Germany.</li>
<li>Scrum is used as the development process and they are very happy with it. The scrum meeting happens over Skype at 2PM. They&#8217;ve found as they&#8217;ve grown they need more process.</li>
<li>Developers develop locally and submit code into Unfuddle.</li>
<li>Hudson is used as their continuous integration environment. Hudson made their migration from the old to the new Glassfish system easier because tests had to be passed before deployment. They didn&#8217;t lose much time because problems were found before production deployment.</li>
<li>They have many tests: unit tests, system tests, Drupal tests. Only when these tests are passed by Hudson can the system be deployed.</li>
<li>Deployment is still done manually to reduce potential downtime.</li>
<li>Whenever they found a problem with garbage collection times it was always due to a memory leak in their application. In case of a memory leak they take a core dump. To analyze a core dump from a 16GB machine is not that easy, you probably can&#8217;t analyze it on your local machine, so what they do is rent a large memory instance on Amazon to analyze the dump. It takes about 2 hours to process the heap dump. They compare two dumps, one after 10 hours of execution time and one after 20 hours of execution time. If there are major differences then it&#8217;s probably because of a memory leak.</li>
</ul>
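<p>The two-dump comparison can be sketched as a simple histogram diff in Python. This assumes each dump has already been reduced to a mapping of class name to live instance count by whatever heap analysis tool is in use; the function name <code>growth_between</code>, the class names, and the counts are made up for illustration.</p>
<pre><code class="language-python">from collections import Counter


def growth_between(earlier, later, top=3):
    """Return the classes whose instance counts grew the most.

    earlier, later: dicts mapping class name to live instance count.
    Counter subtraction keeps only positive differences.
    """
    grown = Counter(later) - Counter(earlier)
    return grown.most_common(top)


# Made-up histograms after 10 and 20 hours of execution.
after_10h = {"SessionData": 1200, "Token": 500000, "CaptchaImage": 300}
after_20h = {"SessionData": 1250, "Token": 500400, "CaptchaImage": 90300}

for name, extra in growth_between(after_10h, after_20h):
    print(name, extra)
</code></pre>
<p>A class whose count keeps climbing between dumps under steady traffic, like the invented <code>CaptchaImage</code> here, is the first place to look for a leak.</p>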
</li>
</ul>
<h2>Future Directions</h2>
<ul>
<li>The Mollom API uses XML-RPC; they are now testing a REST implementation to make it easier for services to mash up with Mollom.</li>
<li>Now that they&#8217;ve transitioned to Cassandra it will be easier for them to scale-out when growth dictates.</li>
<li>Soon to be released are enterprise features that make it possible to manage hundreds of websites as a unit. It will be easy to moderate across a set of websites by sentiment, spam score, or delete all the comments from certain IP addresses.</li>
<li>They&#8217;ve had some talks about getting into the streaming data business like Twitter, but they are restricted by Europe&#8217;s stricter privacy policies.</li>
<li>They will experiment using Glassfish for load balancing in each datacenter.</li>
<li>If the load goes up 10x they will have to add more Cassandra nodes. Disk IO is the bottleneck. Only when they have to grow more than 10x will they need to add more application servers.</li>
</ul>
<h2>Lessons Learned</h2>
<ul>
<li><strong>Efficiency leads to happiness</strong>. Mollom takes high performance engineering very seriously. They are proud that Mollom is extremely cost effective. It can handle many requests on a single server, with low latency, which makes customers happy, which makes them happy because they don&#8217;t have to maintain lots of machines, and costs are low. They&#8217;ve made this a priority from the start and have chosen the right technologies to achieve their goals. This enabled them to take the profit they&#8217;ve made and invest in marketing, building a user base, and building new products on top of Mollom.</li>
<li><strong>Free for breadth, pay for depth</strong>. Machine learning requires a lot of example data to be able to detect spam successfully. To get that data Mollom offers a free service to customers, who provide the breadth of data needed to better train the learning algorithms, they are a constant source of intelligence and feedback. Larger customers provide the revenue and also benefit from the data learned from the free clients. This model seems peculiar to big data and machine learning, which as we all know is the future of everything.</li>
<li><strong>Remove non domain-specific obstacles</strong>. Big systems take a lot of infrastructure work. The infrastructure effort often takes work away from work on a product&#8217;s true value producing domain related features (classifiers, reputation systems, client libraries). Mollom consciously tried to get out of the infrastructure business as much as possible with their selections of Cassandra and Glassfish.</li>
<li><strong>Be careful with client side code</strong>. Code on the client side is attractive because it uses other people&#8217;s resources instead of yours. The problem is that code can be poorly written which will make your system look bad. Work closely with client developers and provide quality reference code examples so developers can learn best practices.</li>
<li><strong>Prioritize paying customers</strong>. Paying customers get a better quality of service. They are handled first in the queues and experience less delay through the system. Paying customers have access to a failover server and free customers only have one server.</li>
<li><strong>Reduce code by letting the stack do the heavy lifting</strong>. In the early days the Mollom code base was a lot bigger than it is now. Cassandra removed a lot of complex code by handling replication and row caching and Glassfish removed a lot of application code and will handle clustering. Simplify over time.</li>
<li><strong>Minimize lock contention</strong>. Mollom spent a lot of time working on minimizing lock contention in the Glassfish server as this became the major bottleneck. Lock as little as possible to maintain full parallelism.</li>
</ul>
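<p>As one way to picture the &#8220;paying customers are handled first in the queues&#8221; lesson, here is a small Python priority queue. The article does not say how Mollom implements its queues, so the tier constants, the <code>RequestQueue</code> class, and the request labels are purely illustrative.</p>
<pre><code class="language-python">import heapq
from itertools import count

PAYING, FREE = 0, 1  # lower number is served first


class RequestQueue:
    """Serve paying requests before free ones, first come first served per tier."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps arrival order within a tier

    def push(self, tier, request):
        heapq.heappush(self._heap, (tier, next(self._order), request))

    def pop(self):
        tier, _, request = heapq.heappop(self._heap)
        return request


queue = RequestQueue()
queue.push(FREE, "free-1")
queue.push(PAYING, "paid-1")
queue.push(FREE, "free-2")
queue.push(PAYING, "paid-2")
served = [queue.pop() for _ in range(4)]
print(served)  # ['paid-1', 'paid-2', 'free-1', 'free-2']
</code></pre>
<p>Free requests are still served in arrival order; they simply wait whenever a paying request is pending.</p>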
<h2>Related Articles</h2>
<ul>
<li><a href="http://mollom.com/api">Mollom API</a></li>
<li><a href="http://drupal.org/project/mollom">Mollom Drupal Module</a></li>
<li><a href="http://mollom.com/files/mollom-technical-whitepaper.pdf">Mollom Technical Whitepaper</a></li>
<li><a href="http://twitter.com/#!/mollom">Mollom&#8217;s Twitter Account</a></li>
<li><a href="http://blogs.sun.com/glassfishpodcast/entry/episode_072_mollom_com_s">Episode #072 &#8211; Mollom.com&#8217;s GlassFish backend with Dries and Johan</a></li>
<li><a href="http://buytaert.net/mollom-gets-a-new-backend">Mollom gets a new backend</a></li>
<li><a href="http://blogs.lodgon.com/johan/">Fighting spam with Mollom on Glassfish</a></li>
<li>To learn more about big data and machine learning take a look at some materials from the <a href="http://strataconf.com/strata2011/public/schedule/proceedings">Strata Conference</a>.</li>
</ul>
<p><a href="http://highscalability.com/blog/2011/2/8/mollom-architecture-killing-over-373-million-spams-at-100-re.html">http://highscalability.com/blog/2011/2/8/mollom-architecture-killing-over-373-million-spams-at-100-re.html</a></p>
<p>Related posts:<ol>
<li><a href='http://www.dbasoul.com/2011/999.html' rel='bookmark' title='Friends For Sale Architecture &#8211; A 300 Million Page View/Month Facebook RoR App'>Friends For Sale Architecture &#8211; A 300 Million Page View/Month Facebook RoR App</a></li>
<li><a href='http://www.dbasoul.com/2011/1018.html' rel='bookmark' title='Playfish&#8217;s Social Gaming Architecture &#8211; 50 Million Monthly Users And Growing'>Playfish&#8217;s Social Gaming Architecture &#8211; 50 Million Monthly Users And Growing</a></li>
<li><a href='http://www.dbasoul.com/2011/995.html' rel='bookmark' title='Mailinator Architecture'>Mailinator Architecture</a></li>
</ol></p>]]></content:encoded>
			<wfw:commentRss>http://www.dbasoul.com/2011/1020.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
