<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    <title>Mike's Solaris Blog</title>
    <link>http://blog.hindsight.it/</link>
    <description>Random thoughts in a Solaris world..</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.5.2 - http://www.s9y.org/</generator>
    
    <image>
        <url>http://blog.hindsight.it/templates/default/img/s9y_banner_small.png</url>
        <title>RSS: Mike's Solaris Blog - Random thoughts in a Solaris world..</title>
        <link>http://blog.hindsight.it/</link>
        <width>100</width>
        <height>21</height>
    </image>

<item>
    <title>Jumpstarting large machines</title>
    <link>http://blog.hindsight.it/index.php?/archives/17-Jumpstarting-large-machines.html</link>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/17-Jumpstarting-large-machines.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=17</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=17</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    I recently found myself with an interesting problem to solve while jumpstarting some very large SPARC machines. Aside from some of the neat functionality that has been stuck in there since I last looked at it in detail, there&#039;s absolutely nothing new about jumpstarting machines, and although I&#039;ve jumpstarted dozens/hundreds of machines of varying sizes over the years, there was significant expectation (and risk) to upgrading these particular machines and I wanted to make sure that I got it right first time, every time. &lt;p&gt; &lt;/p&gt; &lt;br /&gt;
&lt;p&gt;These particular machines are 32-way SPARC64 V based Fujitsu machines (actually 64-way, but each chassis is divided into two partitions). They have numerous I/O controllers (off the top of my head, I&#039;d estimate about 16 controllers), with an even split of traditional SCSI and Fibre Channel controllers that were installed at various points during the lifecycle of the old OS. Most controllers have devices visible on them - either locally installed hard drives, or LUNs visible across the SAN.&lt;/p&gt; &lt;br /&gt;
&lt;p&gt;My plan was to retain the pair of drives from the old Operating System (Solaris 8), and install the new OS onto a fresh pair of drives which were already connected - this would facilitate an easy back-out, should it be required. I could not guarantee that the controller numbers would not renumber during the re-install - and to have Jumpstart accidentally select the wrong devices could lead to the old OS image being overwritten, data loss, or simply just the wrong devices being selected so that the disks are not mirrored between separate system boards as intended.&lt;br /&gt;&lt;/p&gt; &lt;br /&gt;
&lt;p&gt;Now, armed with the engineering manual of the PrimePOWER server in question, I could have researched the probe order of the PCI cards, and correlated this with the existing installation, but to be frank, I&#039;m way too lazy to do that. I&#039;d rather find another less error-prone method of forcing Jumpstart to select the correct devices for install. Additionally, given that a full boot/POST/reboot cycle on these machines can take in excess of an hour, I was keen to ensure that the outage window for the work was kept to an absolute minimum - this meant that I had to be absolutely sure that I was going to get it right first time.&lt;br /&gt;&lt;/p&gt; &lt;br /&gt;
&lt;p&gt;I think I found a good, simple solution. In fact, I&#039;m surprised that this functionality isn&#039;t already allowed in the jumpstart profile.&lt;br /&gt;&lt;/p&gt; &lt;br /&gt;
&lt;p&gt; Within Solaris, we already have a unique device identifier that will persist across reboots and OS re-installs - the physical device path. So, rather than specifying the intended devices using the traditional cXtYdZ notation, we should be able to specify a physical device path (for example, on this particular machine type it would look something like this: /pci@86,4000/scsi@4/sd@0,0). In fact - we don&#039;t need to specify the &lt;strong&gt;entire&lt;/strong&gt; physical path - the only requirement is that we specify enough to uniquely identify &lt;em&gt;&lt;strong&gt;one&lt;/strong&gt;&lt;/em&gt; device, but in most cases it&#039;s probably safer to fully specify that path.&lt;br /&gt;&lt;/p&gt; &lt;br /&gt;
&lt;p&gt;In order to facilitate this, I used a jumpstart &lt;a href=&quot;http://docs.sun.com/app/docs/doc/820-0179/6nbugkooh?a=view&quot;&gt;derived profile&lt;/a&gt;, driven by a very basic &amp;quot;begin&amp;quot; script that checks the system hostname and, according to a case statement, identifies which physical devices are appropriate for this host. It then looks up the logical access path in cXtYdZ format and generates the necessary profile for it.&lt;/p&gt; &lt;br /&gt;
&lt;p&gt;I&#039;ll not fully reproduce the script here because it contains references to client config and other functionality that I may write about in the future, but the following code fragments ought to give you the idea (hostnames have been changed to protect the innocent):&lt;/p&gt; &lt;br /&gt;
&lt;blockquote&gt; &lt;br /&gt;
&lt;pre&gt;case `hostname` in&lt;br /&gt;
&lt;br /&gt;
pw2500a)&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; SOL10_ROOT=/pci@86,4000/scsi@4/sd@0,0&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; SOL10_MIRROR=/pci@a4,4000/scsi@4/sd@0,0&lt;br /&gt;
        ;;&lt;br /&gt;
&lt;br /&gt;
pw2500b)&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; SOL10_ROOT=/pci@8a,4000/scsi@4/sd@0,0&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; SOL10_MIRROR=/pci@94,4000/scsi@4/sd@0,0&lt;br /&gt;
        ;; &lt;br /&gt;
*)&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; gameOver &quot;ERROR: `hostname` not found in config&quot;&lt;br /&gt;
        ;;&lt;br /&gt;
esac&lt;br /&gt;
######################&lt;br /&gt;
# Translate a physical device to a logical cXtXdX &lt;br /&gt;
translateDevice() {&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if [ $# -ne 1 ] ; then&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; gameOver &quot;ERROR : translateDevice() expects exactly one arg&quot;&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; fi&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; DEVICE=$(ls -l /dev/dsk/c*s0 |grep $1 |awk &#039;{print $(NF-2)}&#039;) &lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if [ -z &quot;$DEVICE&quot; ] ; then&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; gameOver &quot;ERROR: $1 not found in /dev/dsk/&quot;&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; fi&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if [ $(echo $DEVICE |wc -w) -gt 1 ] ; then&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; gameOver &quot;ERROR: $1 found, but returned multiple disks ($DEVICE)&quot;&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; fi&lt;/pre&gt; &lt;br /&gt;
&lt;pre&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; print $DEVICE | cut -d/ -f4 | sed &#039;s/s0$//&#039;&lt;br /&gt;
}&lt;br /&gt;
######################&lt;br /&gt;
# Main program&lt;br /&gt;
if [ -z &quot;$SOL10_ROOT&quot; ] ; then&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; gameOver &quot;ERROR: SOL10_ROOT not defined for `hostname`&quot;&lt;br /&gt;
fi &lt;br /&gt;
SOL10_ROOT=$(translateDevice $SOL10_ROOT)&lt;br /&gt;
if [ -z &quot;$SOL10_ROOT&quot; ] ; then&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; exit 1&lt;br /&gt;
fi&lt;br /&gt;
SOL10_ROOT=&quot;${SOL10_ROOT}sXSECTIONX&quot;&lt;br /&gt;
SOL10_DEVS=$SOL10_ROOT&lt;/pre&gt; &lt;br /&gt;
&lt;pre&gt;if [ ! -z &quot;$SOL10_MIRROR&quot; ] ; then&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; SOL10_MIRROR=$(translateDevice $SOL10_MIRROR)&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if [ -z &quot;$SOL10_MIRROR&quot; ] ; then&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; exit 1&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; fi&lt;/pre&gt; &lt;br /&gt;
&lt;pre&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; SOL10_MIRROR=&quot;${SOL10_MIRROR}sXSECTIONX&quot;&lt;br /&gt;
&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; SOL10_DEVS=&quot;$SOL10_ROOT $SOL10_MIRROR&quot;&lt;br /&gt;
fi&lt;/pre&gt; &lt;br /&gt;
&lt;pre&gt;print &quot;filesys mirror $SOL10_DEVS 10240 /&quot;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; |sed &#039;s/XSECTIONX/0/g&#039;&lt;br /&gt;
print &quot;filesys mirror $SOL10_DEVS 4096 swap&quot;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; |sed &#039;s/XSECTIONX/1/g&#039;&lt;br /&gt;
print &quot;filesys mirror $SOL10_DEVS 6144 /var&quot;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; |sed &#039;s/XSECTIONX/3/g&#039;&lt;br /&gt;
print &quot;filesys mirror $SOL10_DEVS free /export&quot;         |sed &#039;s/XSECTIONX/5/g&#039;&lt;br /&gt;
print &quot;metadb $SOL10_ROOT size 8192 count 3&quot;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; |sed &#039;s/XSECTIONX/7/g&#039;&lt;/pre&gt; &lt;br /&gt;
&lt;pre&gt;if [ ! -z &quot;$SOL10_MIRROR&quot; ] ; then&lt;br /&gt;
print &quot;metadb $SOL10_MIRROR size 8192 count 3&quot;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; |sed &#039;s/XSECTIONX/7/g&#039;&lt;br /&gt;
fi&lt;/pre&gt; &lt;br /&gt;
&lt;/blockquote&gt; &lt;br /&gt;
&lt;p&gt; (any errors or omissions in that script are likely to be due to me copy/pasting it into this entry..). &lt;/p&gt; &lt;br /&gt;
&lt;p&gt;Importantly, this method also allows the user to run this script on the &lt;br /&gt;
host prior to the jumpstart actually being performed, and it should &lt;br /&gt;
&lt;strong&gt;still&lt;/strong&gt; accurately&amp;#160; identify the correct devices using their current &lt;br /&gt;
logical access paths. A useful pre-install check:&lt;br /&gt;&lt;/p&gt; &lt;br /&gt;
&lt;blockquote&gt; &lt;br /&gt;
&lt;pre&gt;$ ./getDiskLayout&lt;br /&gt;
filesys mirror c1t0d0s0 c5t0d0s0 10240 /&lt;br /&gt;
filesys mirror c1t0d0s1 c5t0d0s1 4096 swap&lt;br /&gt;
filesys mirror c1t0d0s3 c5t0d0s3 6144 /var&lt;br /&gt;
filesys mirror c1t0d0s5 c5t0d0s5 free /export&lt;br /&gt;
metadb c1t0d0s7 size 8192 count 3&lt;br /&gt;
metadb c5t0d0s7 size 8192 count 3&lt;br /&gt;
$&lt;/pre&gt; &lt;br /&gt;
&lt;/blockquote&gt; &lt;br /&gt;
&lt;p&gt; &lt;/p&gt; &lt;br /&gt;
&lt;p&gt;The end result - every single machine built correctly first time, without a significant amount of manual investigation into the physical device path and PCI probe order. When using derived profiles, there are many ways that the same functionality could be implemented, and this illustrates just one of those methods. &lt;/p&gt; 
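&lt;p&gt;For anyone wanting to try the lookup without touching a real machine, here is a minimal sketch of the same translation run against a canned listing. The function names (sample_listing, translate_device) and the listing itself are made up for illustration, and it is written in plain POSIX sh rather than the ksh of the real begin script. On a live host you would feed it the output of ls -l /dev/dsk/c*s0 instead.&lt;/p&gt;
&lt;blockquote&gt;&lt;pre&gt;
```shell
#!/bin/sh
# Sketch only: sample_listing stands in for `ls -l /dev/dsk/c*s0`.
# The device names and physical paths below are hypothetical.
sample_listing() {
    printf '%s\n' \
        'lrwxrwxrwx 1 root root 46 Apr 1 2010 /dev/dsk/c1t0d0s0 -> ../../devices/pci@86,4000/scsi@4/sd@0,0:a' \
        'lrwxrwxrwx 1 root root 46 Apr 1 2010 /dev/dsk/c5t0d0s0 -> ../../devices/pci@a4,4000/scsi@4/sd@0,0:a'
}

# Print the cXtYdZ name whose symlink target contains the given
# (full or partial) physical path. Returns non-zero unless exactly
# one disk matches, so an ambiguous partial path is rejected.
translate_device() {
    matches=$(sample_listing | awk -v p="$1" 'index($NF, p) { print $(NF-2) }')
    set -- $matches
    if [ $# -ne 1 ]; then
        return 1
    fi
    # /dev/dsk/c1t0d0s0 becomes c1t0d0
    echo "$1" | cut -d/ -f4 | sed 's/s0$//'
}

translate_device /pci@86,4000/scsi@4/sd@0,0
```
&lt;/pre&gt;&lt;/blockquote&gt;
&lt;p&gt;With the canned listing this prints c1t0d0. A partial path such as scsi@4/sd@0,0 matches both disks and is refused, which is the behaviour you want when the wrong guess could overwrite the old OS image.&lt;/p&gt;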
    </content:encoded>

    <pubDate>Sat, 24 Apr 2010 06:08:25 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/17-guid.html</guid>
    
</item>
<item>
    <title>Shared memory controls in Solaris 10</title>
    <link>http://blog.hindsight.it/index.php?/archives/16-Shared-memory-controls-in-Solaris-10.html</link>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/16-Shared-memory-controls-in-Solaris-10.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=16</wfw:comment>

    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=16</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
The controls for IPC settings such as shared memory and semaphores have changed somewhat in Solaris 10. This seems to be a cause of regular confusion, and recently a friend asked me to clarify how they ought to be configured. &lt;br /&gt;
&lt;p&gt;The new facilities in Solaris 10 make such settings dynamic, but the controls to prevent overallocation are still there, albeit in a slightly modified form.&lt;/p&gt;&lt;p&gt;There are two approaches these days..&lt;/p&gt;&lt;ol&gt;&lt;li&gt;/etc/system (yawn, requires a reboot - why would you?)&lt;/li&gt;&lt;li&gt;Projects&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;The IPC settings are entirely controlled by projects these days.. the ability to set values in /etc/system is there solely for backwards compatibility, and simply sets the default values for the projects, and therefore should be considered deprecated.&lt;/p&gt;&lt;p&gt;Probably the best way to explain the use of projects to set default values is to run through a quick worked example of setting the shared memory max allocation (shmmax), with comments inline:&lt;/p&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;pre&gt;$ uname -a&lt;br /&gt;
SunOS deepthought 5.11 snv_69 i86pc i386 i86pc&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;
It&#039;s reporting 5.11 as it&#039;s an OpenSolaris build, but the same principles apply to 5.10&lt;br /&gt;
&lt;blockquote&gt;&lt;pre&gt;$ id -a&lt;br /&gt;
uid=500(mike) gid=100(users) groups=100(users)&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;The user we&#039;re running as has no specific privs, such as would be the case with an Oracle application account (for example). There are no settings in /etc/system, so we&#039;re running with default values. &lt;/p&gt;&lt;p&gt;As an aside, this is running under VMware server on a Linux host with 256MB memory allocated to the VM. The default value of shmmax is calculated as physmem/4, so we ought to see a maximum allocation of around 64MB&lt;/p&gt;&lt;p&gt;We can check what the max shm is going to be using the Solaris 10 prctl command:&lt;/p&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;pre&gt;$ prctl -n project.max-shm-memory $$&lt;br /&gt;
process: 6905: ksh -o vi&lt;/pre&gt;&lt;pre&gt;NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT&lt;br /&gt;
project.max-shm-memory&lt;br /&gt;
         privileged      61.9MB      -   deny                                 -&lt;br /&gt;
         system          16.0EB    max   deny                                 -&lt;br /&gt;
$&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;
&lt;p&gt;That&#039;s reporting 61.9MB, which is pretty close to the expected 64MB. I wonder where the other 2.1MB went?&lt;/p&gt;&lt;p&gt;And to prove the point, I&#039;ve just written a wee program that tests this, by creating (and destroying) increasingly large segments of shared memory (incrementing by 1MB at each iteration). It stops when it fails and reports the last created seg.&lt;/p&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;pre&gt;$ ./findmax&lt;br /&gt;
Shared seg created with 61.0 MB&lt;br /&gt;
shmget failed!&lt;br /&gt;
$&lt;/pre&gt;&lt;/blockquote&gt;&lt;p&gt;So, here&#039;s how we increase it.. We need root privs, or a friendly sysadmin for this, as we&#039;re adding a new project for this user only:&lt;/p&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;pre&gt;$ su&lt;br /&gt;
Password:&lt;br /&gt;
# projadd -U mike -K &amp;quot;project.max-shm-memory=(priv,80MB,deny)&amp;quot; user.mike&lt;br /&gt;
#&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;
&lt;p&gt;Back on as me now.. (notice I&#039;m logging on again by su&#039;ing back to myself, rather than just exiting the &amp;quot;su&amp;quot; shell - this is deliberate)&lt;/p&gt;&lt;br /&gt;
&lt;blockquote&gt;&lt;pre&gt;# su - mike&lt;/pre&gt;&lt;pre&gt;Sun Microsystems Inc.   SunOS 5.11      snv_69  October 2007&lt;br /&gt;
$ prctl -n project.max-shm-memory $$&lt;br /&gt;
process: 7044: ksh&lt;br /&gt;
NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT&lt;br /&gt;
project.max-shm-memory&lt;br /&gt;
         privileged      80.0MB      -   deny                                 -&lt;br /&gt;
         system          16.0EB    max   deny                                 -&lt;br /&gt;
$&lt;br /&gt;
$ ./findmax&lt;br /&gt;
Shared seg created with 80.0 MB&lt;br /&gt;
shmget failed!&lt;br /&gt;
$&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;
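&lt;p&gt;If you want to script the check rather than eyeball it, the following is a rough sketch of the two steps above: the physmem/4 rule of thumb, and reading the privileged value back out of prctl-style output. The function names are my own invention for illustration, and sample_prctl is just a canned copy of the output shown earlier so the example runs anywhere.&lt;/p&gt;
&lt;blockquote&gt;&lt;pre&gt;
```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Solaris.

# Expected default limit in MB for a given amount of physical memory in MB,
# using the physmem/4 rule of thumb (integer arithmetic).
quarter_of_physmem_mb() {
    echo $(( $1 / 4 ))
}

# Read prctl-style output on stdin and print the privileged value.
privileged_limit() {
    awk '$1 == "privileged" { print $2; exit }'
}

# Canned copy of the prctl output from the post, for a dry run.
sample_prctl() {
    printf '%s\n' \
        'process: 7044: ksh' \
        'NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT' \
        'project.max-shm-memory' \
        '         privileged      80.0MB      -   deny                                 -' \
        '         system          16.0EB    max   deny                                 -'
}

quarter_of_physmem_mb 256
sample_prctl | privileged_limit
```
&lt;/pre&gt;&lt;/blockquote&gt;
&lt;p&gt;On a live Solaris 10 box you would pipe the real command into the parser instead of the canned sample, i.e. run prctl -n project.max-shm-memory $$ and pass its output through privileged_limit.&lt;/p&gt;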
 
    </content:encoded>

    <pubDate>Tue, 28 Aug 2007 03:06:06 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/16-guid.html</guid>
    
</item>
<item>
    <title>'crypt'ing in Linux..</title>
    <link>http://blog.hindsight.it/index.php?/archives/15-crypting-in-Linux...html</link>
            <category>Linux</category>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/15-crypting-in-Linux...html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=15</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=15</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
&lt;p&gt;Anybody else noticed that the Solaris &#039;crypt&#039; command isn&#039;t in common use in the Linux community?&lt;/p&gt; &lt;br /&gt;
This proved to be an issue for me a little while back while transferring some data between companies and systems. Fortunately a quick visit to &lt;a href=&quot;http://cvs.opensolaris.org/source/xref/on/usr/src/cmd/crypt/crypt.c&quot;&gt;OpenSolaris&lt;/a&gt; provided the source, which only required a small modification in order to allow it to be compiled for use under Linux..&lt;blockquote&gt;67c67&lt;br /&gt;&amp;lt;       ret = des_crypt(buf, &amp;amp;buf[8]);&lt;br /&gt;---&lt;br /&gt;&amp;gt;       ret = crypt(buf, &amp;amp;buf[8]);&lt;/blockquote&gt;&lt;p&gt;Yet another advantage to Solaris being open-sourced...&lt;/p&gt;&lt;br /&gt;
 
    </content:encoded>

    <pubDate>Mon, 07 Aug 2006 11:21:13 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/15-guid.html</guid>
    
</item>
<item>
    <title>Demystifying ACp..</title>
    <link>http://blog.hindsight.it/index.php?/archives/14-Demystifiying-ACp...html</link>
            <category>SANs</category>
            <category>Unix</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/14-Demystifiying-ACp...html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=14</wfw:comment>

    <slash:comments>5</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=14</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
After a bit of healthy debate in the office about the merits and&lt;br /&gt;
implications of using adaptive copy mode (ACp) with EMC&#039;s SRDF, I wanted to clarify my own&lt;br /&gt;
thoughts on how it operates, and the benefits of using it.&lt;br /&gt;
 &lt;br /&gt;
&lt;p&gt; &lt;/p&gt;Firstly, I guess, for the uninitiated, it&#039;s a question of what exactly is it? - Well, EMC has their SRDF (Symmetrix Remote Data Facility) data replication protocol that allows data to be replicated over distance from one array to another.&lt;br /&gt;&lt;br /&gt;Normally, wherever possible, when running replication at the hardware layer like this (where the hardware has no concept of application transactional consistency), you need the replication to perform synchronously with the I/Os. This essentially means that when a write operation is performed on the source array, it is flushed from disk and committed to the destination (i.e. remote) array &lt;strong&gt;before&lt;/strong&gt; the operation is ack&#039;d to the server as being completed.&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Clearly there is a time overhead with this kind of arrangement, dictated by two factors - the (relatively constant, dependent on array loading) &amp;quot;rippling&amp;quot; effect of the write operation having to pass through two arrays before getting a final &amp;quot;commit&amp;quot;, and also the transmission and acknowledgement time (as dictated by the speed of light and the distance between the arrays, which can be many kilometres apart).&lt;/p&gt;&lt;p&gt;From this (the aggregate of two times the transmission latency and the write latency for the remote array), it can be easily seen that the distance imposes an immediate bottleneck (not necessarily in bandwidth, but certainly in transaction time - roundtrip times of 20-30ms are not unusual).&lt;/p&gt;&lt;p&gt;As the long-haul links between the arrays become congested, performance will rapidly degrade, and because of the synchronous nature your application will run slowly (I&#039;ve seen write operations in excess of 300-400ms reported by the OS in bad cases)&lt;/p&gt;&lt;p&gt;In contrast, an async transfer mode will commit the write operation to local storage while transmitting the write to the remote array in 
the background. This gives performance comparable to that of a non-SRDF setup, at the expense of the risk of missing I/O transactions in the event of a link or primary site failure. Because of the asynchronous nature of the SRDF transaction, even when the long-haul links between the two arrays are congested, your application will perform well and the SRDF updates will conclude at the next available opportunity.&lt;/p&gt;&lt;p&gt;And so, back to ACp - a compromise between the two.&lt;/p&gt;&lt;p&gt;The trouble with Synchronous is that you don&#039;t necessarily want your application slowing down at &lt;strong&gt;every&lt;/strong&gt; busy spot during the day; perhaps you want to make more efficient usage of your bandwidth by playing the odds. You might, for example, feel that running your long haul link at an extremely high utilisation is more cost effective than an upgrade in bandwidth. ACp will allow you to do this, but with the caveat that during peak load, you will be slightly out of sync.&lt;br /&gt;&lt;br /&gt;ACp introduces the concept of the skew value.. essentially a threshold, counted in number of write operations. Below the skew value, the device pairing operates in async mode, and above the threshold it switches to synchronous mode (the skew value normally defaults to 65536 operations).&lt;br /&gt;&lt;br /&gt;So, for example, playing the odds.. If running your long haul link at a higher utilisation meant that you couldn&#039;t keep your local app running quickly enough due to the link contention and latency, ACp may help. You might not be fully consistent one hundred percent of the day, but 98% might be good enough, especially when coupled with the local recovery abilities of journalled filesystems and modern DBMS.&lt;/p&gt;Personally, I have only ever used ACp to mitigate high-bandwidth situations, for example bringing a new pair of arrays online, and having to perform a full SRDF establish of many terabytes of data. However.. 
the gotcha here is, depending on what &lt;b&gt;else&lt;/b&gt; is using your long-haul links.. you may want to set your other services to ACp also, lest your high-bandwidth transfers have a knock-on effect on other (synchronous) pairings that are going on over the same links.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;
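&lt;p&gt;To put some rough numbers on the above, here is a back-of-envelope sketch of the two ideas: the synchronous write time as two link traversals plus the remote write latency, and the skew threshold that decides which mode an ACp pairing is in. The function names are made up, the figure of about 5 microseconds per kilometre one way in fibre is an assumption on my part, and treating a backlog exactly equal to the skew value as synchronous is also an assumption rather than anything taken from EMC documentation.&lt;/p&gt;
&lt;blockquote&gt;&lt;pre&gt;
```shell
#!/bin/sh
# Sketch only. Assumptions: roughly 0.005 ms per km one way in fibre, and
# a pairing sitting exactly at the skew value is treated as synchronous.

# Estimated synchronous write time in ms: two link traversals plus the
# write latency of the remote array. Ignores congestion and queueing.
sync_write_ms() {   # args: distance_km remote_write_ms
    awk -v km="$1" -v w="$2" 'BEGIN { printf "%.1f\n", 2 * km * 0.005 + w }'
}

# Mode an ACp pairing would be in for a given backlog of write operations.
acp_mode() {        # args: pending_writes skew
    if [ "$1" -lt "$2" ]; then
        echo async
    else
        echo sync
    fi
}

sync_write_ms 1000 2
acp_mode 100 65536
acp_mode 70000 65536
```
&lt;/pre&gt;&lt;/blockquote&gt;
&lt;p&gt;The first call estimates a 1000km link with a 2ms remote write; the other two show a small backlog staying asynchronous and a backlog past the skew value falling back to synchronous. Real links add congestion and protocol overhead on top, which is where the 300-400ms horror stories come from.&lt;/p&gt;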
 
    </content:encoded>

    <pubDate>Tue, 25 Jul 2006 09:16:45 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/14-guid.html</guid>
    
</item>
<item>
    <title>Back in Blighty..</title>
    <link>http://blog.hindsight.it/index.php?/archives/13-Back-in-Blighty...html</link>
            <category>Motorcycling</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/13-Back-in-Blighty...html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=13</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=13</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
&lt;p&gt;I&#039;ve been debating whether or not to post this here, as it&#039;s not strictly a unixy thing, but what the hell.. I&#039;m enthused enough to be boring enough people locally about it, so a few more isn&#039;t going to harm the world.&lt;/p&gt;&lt;p /&gt; &lt;br /&gt;
&lt;p&gt;I&#039;m just back in the country after spending two and a half weeks touring Eastern Europe by motorbike. It&#039;s the latest in a series of trips that I&#039;ve made over the years. &lt;/p&gt;&lt;p&gt;It wasn&#039;t the longest, the fastest, or probably even the most challenging (I think the Sahara desert two years ago wins that prize), but it was certainly the most entertaining. Fifteen countries and 4,135 miles in 18 days. Brilliant.&lt;/p&gt;&lt;p&gt;Full story &lt;a href=&quot;http://photo.hindsight.it/2006-07-24,%20Eastern%20Europe/&quot;&gt;here&lt;/a&gt;&lt;/p&gt;&lt;p /&gt; 
    </content:encoded>

    <pubDate>Tue, 25 Jul 2006 07:18:01 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/13-guid.html</guid>
    
</item>
<item>
    <title>What do you do with your decommissioned hardware?</title>
    <link>http://blog.hindsight.it/index.php?/archives/12-What-do-you-do-with-your-decommissioned-hardware.html</link>
            <category>Unix</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/12-What-do-you-do-with-your-decommissioned-hardware.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=12</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=12</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
&lt;p&gt;We recently purchased a few drives to fit into one of our servers. I&#039;ll not say who supplied the drives, or even what kind of server, suffice to say it&#039;s not Sun, and it runs Solaris.&lt;/p&gt;&lt;p&gt;What arrived was a bit disturbing.&lt;/p&gt;&lt;br /&gt;
 &lt;br /&gt;
&lt;p&gt;The drives were purchased as new parts, and were delivered fully sealed and packaged as a new drive.&lt;/p&gt;&lt;p&gt;We installed the drives, and noticed that the VTOC on one of the drives was a bit different to what we&#039;d normally expect, so we checked with fstyp, and found (to our surprise) a few UFS filesystems.&lt;/p&gt;&lt;p&gt;To our astonishment we then proceeded to mount up a root filesystem from a system that was clearly from another UK company (in particular, a large mobile telephone provider).&lt;/p&gt;&lt;p&gt;The root filesystem in question appeared to have been shut down cleanly, and there were no tell-tale signs to suggest that it was a failed drive that had been remanufactured. Presumably it is from a system that has been traded in, or otherwise decommissioned.&lt;/p&gt;&lt;p&gt;Out of interest, we noted that the &amp;quot;root&amp;quot; and &amp;quot;oracle&amp;quot; accounts had encrypted passwords in /etc/shadow, and proceeded to run a password cracker against it.&lt;/p&gt;&lt;p&gt;&lt;br /&gt;So - what do YOU do with your decommissioned kit? And perhaps more pertinently, what does your &lt;b&gt;vendor&lt;/b&gt; do with your decommissioned kit?&lt;/p&gt;&lt;p /&gt; 
    </content:encoded>

    <pubDate>Wed, 21 Jun 2006 05:09:07 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/12-guid.html</guid>
    
</item>
<item>
    <title>Expect the unexpected</title>
    <link>http://blog.hindsight.it/index.php?/archives/11-Expect-the-unexpected.html</link>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/11-Expect-the-unexpected.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=11</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=11</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
&lt;p&gt;Working a particularly troublesome problem this week reminded me of two things: I realised (not for the first time in my career) that sometimes the unlikeliest root cause will be there to bite you in the ass. &lt;/p&gt;&lt;p&gt;It also reminded me of a phrase once used by the late &lt;a href=&quot;http://www.bbc.co.uk/radio1/johnpeel/&quot;&gt;John Peel&lt;/a&gt;:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;I never make stupid mistakes. Only very, very clever ones.&lt;/p&gt;&lt;/blockquote&gt;&lt;p /&gt; &lt;br /&gt;
&lt;p&gt;You see, I&#039;m currently working a plan to decommission a whole raft of legacy E10000 systems from the current client. &lt;/p&gt;&lt;p&gt;We moved a four-board domain last week from one datacentre to another, using a previously idle domain on the destination platform. Everything went smoothly - all three SANs were visible, the network appeared to come up, we re-applied a new Veritas license (because the hostid had changed). However - after I did a final reboot before leaving for the night I noted that I was having trouble connecting to the public network interface. In fact, I was seeing in excess of 95% packet loss on inbound traffic.&lt;/p&gt;&lt;p&gt;Backing out the change was not a (pleasant) option, so instead I updated DNS and the defaultrouter to divert all traffic through the backup network as a workaround, and left for the night (it was getting very late, and this wasn&#039;t a production host).&lt;/p&gt;&lt;p&gt;The following day I rustled up a member of the comms team, and we spent an afternoon in the datacentre, trying to troubleshoot in a structured manner. The results of this testing were annoyingly inconclusive - the problem was showing up on multiple cables, multiple NICs, multiple hosts and multiple switch ports.&lt;/p&gt;&lt;p&gt;As a workaround to try and ensure that the public and backup traffic don&#039;t beat each other up, we decided to connect the public network interface into the backup subnet.&lt;/p&gt;&lt;p&gt;That was the very, very clever mistake. Normally this wouldn&#039;t be a problem, because normally we have local-mac-address set, to ensure that each NIC on the machine has a unique Ethernet address. 
Of course - with local-mac-address set to false on this domain, both interfaces now presented the same Ethernet address on one subnet, and this caused problems at the first point that significant traffic was being passed through the backup interface. But when the problems were reported and I checked the configuration with &amp;quot;ifconfig&amp;quot;, it struck me that the behaviour that we were seeing was remarkably similar to the &lt;i&gt;other&lt;/i&gt; problem with the public network.&lt;/p&gt;&lt;p&gt;I quickly logged onto a machine with a connection to that public network, performed a broadcast ping (&amp;quot;ping -s 255.255.255.255&amp;quot;), then checked the ARP table and compared it with the Ethernet address of the problem domain. My suspicions were confirmed... we have two E10k domains with the same Ethernet address (and hostid), and both have &amp;quot;local-mac-address&amp;quot; set to false. This must have been a deliberate action, as Sun must have generated the necessary key to allow the domain to be created this way, but the real reason why this sysadmin trap was set has been lost over the years.&lt;/p&gt;&lt;p&gt;Sheesh... &lt;/p&gt; 
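&lt;p&gt;The ARP table comparison described above can be sketched in Python. This is a minimal illustration, not the procedure actually used on the day: it assumes the ARP entries have already been collected as (host, Ethernet address) pairs, and the host names and addresses below are made up. Addresses are normalised first because Solaris tools print octets without leading zeros. A multi-homed host without local-mac-address legitimately shows one address on several interfaces, so entries are grouped by host rather than by IP.&lt;/p&gt;
&lt;pre&gt;
```python
from collections import defaultdict


def normalise_mac(mac):
    """Pad each octet to two lower-case hex digits (Solaris prints 8:0:20:...)."""
    return ":".join(octet.lower().zfill(2) for octet in mac.split(":"))


def find_duplicate_macs(arp_entries):
    """Return {mac: sorted hosts} for every MAC claimed by more than one host."""
    hosts_by_mac = defaultdict(set)
    for host, mac in arp_entries:
        hosts_by_mac[normalise_mac(mac)].add(host)
    return {
        mac: sorted(hosts)
        for mac, hosts in hosts_by_mac.items()
        if len(hosts) != 1
    }


# Hypothetical ARP table contents, for illustration only.
arp_entries = [
    ("domain-a", "8:0:20:a1:b2:c3"),
    ("domain-b", "08:00:20:A1:B2:C3"),
    ("gateway", "0:3:ba:11:22:33"),
]
duplicates = find_duplicate_macs(arp_entries)
print(duplicates)
```
&lt;/pre&gt;
&lt;p&gt;With the made-up entries above, the two domains are reported as sharing one Ethernet address while the gateway is left out.&lt;/p&gt;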
    </content:encoded>

    <pubDate>Sat, 20 May 2006 08:08:27 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/11-guid.html</guid>
    
</item>
<item>
    <title>Vx Storage Foundation Free!</title>
    <link>http://blog.hindsight.it/index.php?/archives/10-Vx-Storage-Foundation-Free!.html</link>
            <category>Linux</category>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/10-Vx-Storage-Foundation-Free!.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=10</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=10</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
Good grief - I&#039;ve only just heard about &lt;a href=&quot;http://www.symantec.com/enterprise/sfbasic/index.jsp&quot;&gt;this&lt;/a&gt;&lt;br /&gt;
 There are of course restrictions (Linux/Solaris x64 and only a limited number of volumes). Is this evidence of Veritas, &lt;strong&gt;cough&lt;/strong&gt;, Symantec coming under ZFS pressure? 
    </content:encoded>

    <pubDate>Thu, 18 May 2006 14:15:53 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/10-guid.html</guid>
    
</item>
<item>
    <title>Failover article online</title>
    <link>http://blog.hindsight.it/index.php?/archives/9-Failover-article-online.html</link>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/9-Failover-article-online.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=9</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=9</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
As &lt;a href=&quot;http://blog.hindsight.it/index.php?/archives/5-Clones,-Failovers-and-Migrations.html#comments&quot;&gt;requested&lt;/a&gt;, I&#039;ve put the Clones, Failovers and Migrations article online &lt;a href=&quot;http://hindsight.it/sysadmin&quot;&gt;here&lt;/a&gt;. As always - feedback is appreciated, in fact.. consider it mandatory if you find it useful &lt;img src=&quot;http://blog.hindsight.it/templates/default/img/emoticons/wink.png&quot; alt=&quot;;-)&quot; style=&quot;display: inline; vertical-align: bottom;&quot; class=&quot;emoticon&quot; /&gt;&lt;br /&gt;
  
    </content:encoded>

    <pubDate>Thu, 18 May 2006 13:59:09 -0400</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/9-guid.html</guid>
    
</item>
<item>
    <title>Rsync saves the day</title>
    <link>http://blog.hindsight.it/index.php?/archives/8-Rsync-saves-the-day.html</link>
            <category>Linux</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/8-Rsync-saves-the-day.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=8</wfw:comment>

    <slash:comments>0</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=8</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
&lt;p&gt;After installing a replacement drive and finding myself unable to shrink a filesystem to fit the smaller replacement, I encountered an interesting problem today - how to copy a large dataset containing many hardlinked files.&lt;/p&gt;&lt;p&gt;Hard links are used relatively rarely, and so don&#039;t normally cause an issue - however - there was an interesting solution using &lt;a href=&quot;http://www.samba.org/rsync/&quot;&gt;rsync&lt;/a&gt;..&lt;/p&gt;&lt;p /&gt; &lt;br /&gt;
At home I have a Red Hat, &lt;a href=&quot;http://fedora.redhat.com/&quot;&gt;Fedora &lt;/a&gt;based server, which earns its keep by serving a number of roles. The most fundamental facility that it provides is that of fileserver, with a pair of (mirrored) 160GB drives.&lt;br /&gt;&lt;br /&gt;Earlier in the week one of these drives failed on me, and after finding that its warranty had expired, I ordered a replacement. &lt;br /&gt;&lt;br /&gt;With the benefit of hindsight, I guess I ought to have ordered an obviously larger drive (say, 200GB). Without that piece of common sense.. drive replacements are always a bit of a lottery - especially when the previous make/model is no longer a viable option. Needless to say - the replacement turned out to be about 4GB smaller than the original. &lt;br /&gt;&lt;br /&gt;Being a hardened Solaris geek, I guess I&#039;m just a little too used to having the benefit of VxFS in production environments, which would have made shrinking the filesystem an absolute doddle.&lt;br /&gt;&lt;br /&gt;However, with the aid of Google, it began to be clear that shrinking the existing ext3 filesystem so that I wouldn&#039;t truncate it on the smaller disk was going to be difficult.&lt;br /&gt;&lt;br /&gt;&amp;quot;No problem&amp;quot; thinks I, proceeding to dump/restore the filesystem across onto the new disk.  Shortly after, it dawned on me that this &lt;a href=&quot;https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=90894&quot;&gt;wasn&#039;t going to work&lt;/a&gt; and that I had a problem. &lt;br /&gt;&lt;br /&gt;You see, amongst other things, this server acts as the backup server for my laptop using &amp;quot;&lt;a href=&quot;http://backuppc.sourceforge.net/&quot;&gt;BackupPC&lt;/a&gt;&amp;quot;. BackupPC is quite simply an astounding piece of software, and suits my purpose absolutely perfectly. 
&lt;br /&gt;&lt;br /&gt;The problem lies in the way that BackupPC stores its backups on the filesystem - as it backs up files from clients, it MD5s the file and indexes the checksums, thereby identifying whether that file has been previously backed up.&lt;br /&gt;&lt;br /&gt;If the file has been backed up already, it knows from the MD5 hashed index that it doesn&#039;t need to make another copy, and so just creates a hard link to the original. &lt;br /&gt;&lt;br /&gt;And therein lies the problem... &lt;br /&gt;&lt;br /&gt;In the backup pool, there can be files with literally hundreds of hardlinks, which dump/restore (and even tar) didn&#039;t handle very well at all. Rather than a 1MB file with a hundred links, there would be one hundred 1MB files after I had finished copying.&lt;br /&gt;&lt;br /&gt;It took me a while, but I spotted a previously unnoticed option to &lt;a href=&quot;http://www.samba.org/rsync/&quot;&gt;rsync&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;        -H, --hard-links            preserve hard links&lt;/blockquote&gt;&lt;br /&gt;Ideal - kudos once again to the rsync developers..&lt;br /&gt;&lt;br /&gt;Filesystem copied, my backups preserved and the metadevice set up again. Back in business.&lt;br /&gt;
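&lt;p&gt;What preserving hard links means in practice can be illustrated with a short Python sketch. This is not how rsync is implemented; it is a toy, under the assumption that remembering each source file by device and inode number is enough, which recreates a hard link whenever it meets the same inode again. The function names and the demo files are hypothetical, and symlinks, ownership and special files are ignored.&lt;/p&gt;
&lt;pre&gt;
```python
import os
import shutil
import tempfile


def copy_tree_preserving_links(src, dst):
    """Copy src to dst, recreating hard links instead of duplicating data."""
    seen = {}  # (st_dev, st_ino) of a source file, mapped to its first copy
    for root, _dirs, files in os.walk(src):
        relative = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, relative))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(target_dir, name)
            info = os.lstat(source)
            key = (info.st_dev, info.st_ino)
            if key in seen:
                # Same inode as an earlier file: link to the copy made then.
                os.link(seen[key], target)
            else:
                shutil.copy2(source, target)
                seen[key] = target


def demo():
    """Build a tiny pool with three links to one file, copy it, count inodes."""
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "pool")
        dst = os.path.join(work, "copy")
        os.makedirs(src)
        first = os.path.join(src, "file-1")
        with open(first, "w") as handle:
            handle.write("backup data")
        os.link(first, os.path.join(src, "file-2"))
        os.link(first, os.path.join(src, "file-3"))
        copy_tree_preserving_links(src, dst)
        names = os.listdir(dst)
        inodes = {os.lstat(os.path.join(dst, name)).st_ino for name in names}
        return len(names), len(inodes)


print(demo())
```
&lt;/pre&gt;
&lt;p&gt;On a filesystem that supports hard links, the demo should report three names sharing a single inode in the copy, rather than three separate files.&lt;/p&gt;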
 
    </content:encoded>

    <pubDate>Sat, 04 Mar 2006 12:39:25 -0500</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/8-guid.html</guid>
    
</item>
<item>
    <title>Solaris 10 in production.. finally.</title>
    <link>http://blog.hindsight.it/index.php?/archives/7-Solaris-10-in-production..-finally..html</link>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/7-Solaris-10-in-production..-finally..html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=7</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=7</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
&lt;p&gt;&lt;br /&gt;
Something strange happened here today. I&#039;m not sure I get some of the reasons why it happened.. partly client company politics, I guess.&lt;/p&gt;&lt;p&gt;I&#039;ve been pushing the use of Solaris 10 quite heavily recently, and after trying to drum up enthusiasm for a while, found an ideal project to make the first production machines.&lt;/p&gt; &lt;br /&gt;
&lt;p&gt;This was all well and good, and the techies that I&#039;d spoken to on the subject were keen.&lt;/p&gt;&lt;p&gt;However - it seemed that this move worried some management, and it took a meeting to drill down to what the concerns were.&lt;/p&gt;&lt;p&gt;Solaris 10 has had a lot of good publicity since its launch.. we all know it&#039;s got some great things in it (and some great things to come). However - with such huge advances over Solaris 9 - it&#039;s gotta be a whole heap different from the previous release, and therefore difficult to manage the transition, right?&lt;/p&gt;&lt;p&gt;In truth - from an operational perspective, there isn&#039;t much difference in a Solaris 10 build from its predecessor. &lt;/p&gt;&lt;p&gt;Yes, there are a whole bunch of interesting features out there to try out, but the vast majority of them are &lt;i&gt;optional&lt;/i&gt; - take for example, containers - it&#039;s a piece that I find intriguing, and lined with opportunities, but it&#039;s still developing... I&#039;m not yet convinced that I would advocate their use on an important production system.&lt;/p&gt;&lt;p&gt;DTrace - a &lt;b&gt;great&lt;/b&gt; troubleshooting tool, which is going to be desperately useful - is there if you want it. You can ignore it, and the system will run just fine.&lt;br /&gt; &lt;/p&gt;&lt;p&gt;There is however, one exception.. the SMF. This is the biggest change to the operational interface, and it&#039;s going to be the one that trips up those coming to Solaris 10 for the first time.&lt;/p&gt;&lt;p&gt;Goodness knows what the vendors are going to make of it.. hell, they &lt;strong&gt;still&lt;/strong&gt; can&#039;t get to grips with normal init.d/rc?.d scripts..&lt;/p&gt;&lt;p&gt;(Don&#039;t get me wrong - I like the SMF.. it&#039;s just going to be funny seeing a Solaris machine silently boot.... quickly).&lt;/p&gt;&lt;p /&gt; 
    </content:encoded>

    <pubDate>Mon, 13 Feb 2006 18:40:39 -0500</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/7-guid.html</guid>
    
</item>
<item>
    <title>Clones, Failovers and Migrations</title>
    <link>http://blog.hindsight.it/index.php?/archives/5-Clones,-Failovers-and-Migrations.html</link>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/5-Clones,-Failovers-and-Migrations.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=5</wfw:comment>

    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=5</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
I&#039;ve recently finished my fourth article for SysAdmin magazine - this was a bit of a rush job, as they want to publish in the &amp;quot;Clusters&amp;quot; themed issue in April. It is a discussion of a technique for system failover that I recently put into production, which more-or-less guarantees transparency of failover.&lt;br /&gt;&lt;br /&gt;
 &lt;br /&gt;
&lt;p&gt;I&#039;ll not publish the full article online until after the magazine put it in print, however, here&#039;s the introduction.&lt;/p&gt;&lt;p /&gt;&lt;blockquote&gt;&lt;p&gt;Whilst working at a customer site, I recently had a requirement to design a mechanism whereby a very large legacy database server could be failed over from one site to another. &lt;/p&gt;&lt;p&gt;Over the years several methods of doing so had been investigated, and subsequently abandoned for various financial, technical and political reasons.&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;Years after the initial service was implemented, the system was still a major single point of failure for the customer, and scheduled to still be in active duty for anything up to eighteen months.&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;During this time, the size and criticality of the system meant that the risk of implementing a creative solution was deemed to be inappropriate, until a single, significant event forced the client to reassess the situation.&lt;br /&gt;
The solution had to be constructed around the existing system, keeping change to a minimum, and absolutely avoiding any direct interference with the database and application.  &lt;br /&gt;
&lt;br /&gt;
&lt;/p&gt;&lt;p&gt;We had a healthy budget to implement, however, we felt that a competent solution could be delivered whilst utilising a fraction of the available funds.&lt;br /&gt;
&lt;/p&gt;&lt;/blockquote&gt; 
    </content:encoded>

    <pubDate>Tue, 31 Jan 2006 06:05:44 -0500</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/5-guid.html</guid>
    
</item>
<item>
    <title>SAME again</title>
    <link>http://blog.hindsight.it/index.php?/archives/3-SAME-again.html</link>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/3-SAME-again.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=3</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=3</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
&lt;p&gt;Back onto the subject of striping, here&#039;s a thought that I&#039;ve harboured for a little while, but never actually gotten around to testing formally. Perhaps sometime soon I&#039;ll find the time to give it a thorough test and analysis and report back..&lt;/p&gt;&lt;p&gt;Common knowledge tells us to take the average size of a read or write operation (dependent on whether an app is read- or write-mostly) and to divide it by the number of spindles that we are striping against in order to calculate the correct column width.&lt;/p&gt;&lt;p&gt;I&#039;m not so sure.&lt;/p&gt; &lt;br /&gt;The only way that this mythical &lt;i&gt;&lt;b&gt;average&lt;/b&gt;&lt;/i&gt; operation will utilise every spindle once is if (and &lt;b&gt;only&lt;/b&gt; if) it starts at block zero of the stripe.&lt;p&gt;Another quick calculation here. A typical stripe width is 1MB (2048 blocks), so our chance of the operation starting at block zero is about 0.049%.&lt;/p&gt;
&lt;p&gt;Let&#039;s also think about what happens when an &amp;quot;average&amp;quot; operation starts at block &lt;i&gt;n&lt;/i&gt; (where &lt;i&gt;n&lt;/i&gt;&amp;gt;0). The operation will clearly &amp;quot;wrap around&amp;quot; the stripe, and partially write onto the very next column (which happens to be on the same disk that we started on).&lt;/p&gt;
&lt;p&gt;There is a relatively simple solution, and that is how I believe we &lt;b&gt;&lt;i&gt;ought&lt;/i&gt;&lt;/b&gt; to be calculating the column widths: take the size of the &amp;quot;average&amp;quot; operation, and divide by the number of spindles/LUNs (and here&#039;s the important bit).. minus one.&lt;/p&gt;&lt;p&gt;So for example, we have an average operation of 1MB, and eight spindles. We can calculate our desired column width by:&lt;/p&gt;&lt;p&gt;    1024KB / (8-1) ~= 146KB&lt;/p&gt;&lt;p&gt;Conventional wisdom would have placed this at:&lt;/p&gt;&lt;p&gt;    1024KB / 8 = 128KB&lt;/p&gt;
&lt;p&gt;The next question is likely to be &amp;quot;will this have a major impact?&amp;quot;, and therefore &amp;quot;do we care?&amp;quot;. Not sure to be honest. I&#039;d like to find the time to fire up DTrace for a proper test. It seems likely that using the larger column width ought to reduce the number of operations, and possibly then reduce contention.&lt;/p&gt;&lt;br /&gt;
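&lt;p&gt;For anyone wanting to try other sizes, the arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the sums in this post: the function names are made up for illustration, and the alignment figure assumes that every starting block within the stripe is equally likely.&lt;/p&gt;
&lt;pre&gt;
```python
def column_width_kb(average_io_kb, spindles, minus_one=False):
    """Column width in KB: average I/O size over the spindles (or spindles - 1)."""
    divisor = spindles - 1 if minus_one else spindles
    return average_io_kb / divisor


def chance_of_aligned_start(stripe_blocks):
    """Probability that an operation starts at block zero of the stripe,
    assuming every starting block is equally likely."""
    return 1 / stripe_blocks


conventional = column_width_kb(1024, 8)
proposed = column_width_kb(1024, 8, minus_one=True)
aligned = chance_of_aligned_start(2048)

print(f"conventional column width: {conventional:.0f}KB")
print(f"proposed column width:     {proposed:.0f}KB")
print(f"chance of aligned start:   {aligned:.3%}")
```
&lt;/pre&gt;
&lt;p&gt;With a 1MB average operation and eight spindles this gives 128KB and roughly 146KB respectively, and an aligned start about 0.049% of the time.&lt;/p&gt;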
 
    </content:encoded>

    <pubDate>Thu, 26 Jan 2006 11:33:00 -0500</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/3-guid.html</guid>
    
</item>
<item>
    <title>Another closure, another customer gone..</title>
    <link>http://blog.hindsight.it/index.php?/archives/4-Another-closure,-another-customer-gone...html</link>
    
    <comments>http://blog.hindsight.it/index.php?/archives/4-Another-closure,-another-customer-gone...html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=4</wfw:comment>

    <slash:comments>2</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=4</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
With the &lt;a href=&quot;http://news.bbc.co.uk/1/hi/scotland/4642700.stm&quot;&gt;recent news&lt;/a&gt; that &lt;a href=&quot;http://www.lexmark.com&quot;&gt;Lexmark&lt;/a&gt; is closing its inkjet cartridge manufacturing facility in Rosyth, I&#039;ll be losing a good customer. &lt;br /&gt;
 &lt;br /&gt;
&lt;p&gt;It would seem that the plant is closing for economic reasons - presumably the cost of production per unit is too high. This is likely to be due to a number of things - falling prices of laser printers, increased competition between the inkjet refill suppliers - but the gaze of suspicion has to fall on the manpower costs that are incurred by running a workforce in a so-called first world country.&lt;/p&gt;&lt;p&gt;The company also has major production facilities in the Philippines and Mexico, which will presumably be taking up the Scottish workload.&lt;/p&gt;&lt;p&gt;It would appear that the large corporates are taking an increasingly abstract view of their global locations, concentrating on the bottom line, and to be honest, can we really blame them? I recently shopped around to purchase a consumer electrical product at the cheapest price. In many ways, they are doing just the same, just on a rather grander scale. &lt;/p&gt;&lt;p&gt;Where will this lead us though? - I&#039;m no financial or economic expert, but surely this trend is unsustainable in the long term... &lt;/p&gt;&lt;p /&gt; 
    </content:encoded>

    <pubDate>Wed, 25 Jan 2006 03:15:21 -0500</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/4-guid.html</guid>
    
</item>
<item>
    <title>SAME difference</title>
    <link>http://blog.hindsight.it/index.php?/archives/2-SAME-difference.html</link>
            <category>Oracle</category>
            <category>Solaris</category>
    
    <comments>http://blog.hindsight.it/index.php?/archives/2-SAME-difference.html#comments</comments>
    <wfw:comment>http://blog.hindsight.it/wfwcomment.php?cid=2</wfw:comment>

    <slash:comments>1</slash:comments>
    <wfw:commentRss>http://blog.hindsight.it/rss.php?version=2.0&amp;type=comments&amp;cid=2</wfw:commentRss>
    

    <author>nospam@example.com (Mike Scott)</author>
    <content:encoded>
    &lt;br /&gt;
&lt;p&gt;I was chatting with Doug recently, and the conversation rolled onto disk and volume topologies and layouts. &lt;/p&gt;&lt;p&gt;Where I&#039;m working at the moment, there has been a myth propagated that striping is good for data filesystems, and concats are better for redo logs.&lt;/p&gt;&lt;p&gt;Quite where this has come from, I&#039;m not entirely sure. However.. whilst thinking about this I came to the conclusion that for redo logs, the layout is largely irrelevant.&lt;/p&gt; &lt;br /&gt;
Assuming, that is, that if you are striping you are using a reasonably large column width, and that you&#039;re not trying to RAID5 the redo log, which would likely result in poor performance unless accelerated by some write caching controller. The thinking went something along these lines:&lt;br /&gt;&lt;br /&gt;The theory is that the size of a redo log write will be very small: a matter of bytes or, at the most, a couple of kilobytes, I&#039;d imagine (it&#039;s been a while since I&#039;ve actually traced any of this through to physical I/O - so this is a largely &lt;i&gt;&lt;b&gt;theoretical&lt;/b&gt;&lt;/i&gt; argument). So let&#039;s assume for the time being that an &amp;quot;average&amp;quot; redo write is 1KB, and that we have a striped layout with a column width of 256KB (this was based on a specific example). The chance of a redo log write spanning two columns (and therefore requiring two separate physical write ops instead of one) is going to be:&lt;br /&gt;&lt;br /&gt;KB in a column= 256&lt;br /&gt;Bytes in a column= 256*1024= 262144&lt;br /&gt;&lt;br /&gt;For a two block write (2* 512bytes) to span two columns, it would have to start within the last (512*2-1) bytes of the column (=1023 bytes)&lt;br /&gt;&lt;p&gt;The probability therefore of a redo log write requiring two separate write operations = 1023/262144 ~= 0.39% (and even then, the two writes are on distinct spindles, possibly even different controllers, and will probably happen in parallel).&lt;/p&gt;Based on the aforementioned assumptions - that sort of probability I can live with, and so can the DBA team. So, whilst we are striping the data volumes, it makes good sense to also stripe the redo logs. Keep it the &lt;a title=&quot;Oracle SAME White Paper&quot; href=&quot;http://www.oracle.com/technology/deploy/availability/pdf/oow2000_same.pdf&quot;&gt;SAME&lt;/a&gt;.  
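&lt;p&gt;The sum above is easy to check in Python. This is a sketch of the same theoretical model, not a measurement: it assumes a write is equally likely to start at any byte offset within a column, and the function name is made up for illustration. Note that 1023/262144 is about 0.0039 as a fraction, which is about 0.39% when expressed as a percentage.&lt;/p&gt;
&lt;pre&gt;
```python
def chance_of_spanning(write_bytes, column_bytes):
    """Probability that a write crosses a column boundary, assuming its start
    offset is uniformly distributed over the bytes of a column."""
    # Start offsets that leave too little room for the whole write.
    crossing_offsets = write_bytes - 1
    return crossing_offsets / column_bytes


column_bytes = 256 * 1024          # 256KB column width
write_bytes = 2 * 512              # a two block (1KB) redo write
probability = chance_of_spanning(write_bytes, column_bytes)

print(f"bytes in a column: {column_bytes}")
print(f"chance of spanning two columns: {probability:.2%}")
```
&lt;/pre&gt;
&lt;p&gt;Larger column widths push the figure down further, which is the reason for assuming a reasonably large column width in the first place.&lt;/p&gt;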
    </content:encoded>

    <pubDate>Tue, 24 Jan 2006 03:33:00 -0500</pubDate>
    <guid isPermaLink="false">http://blog.hindsight.it/index.php?/archives/2-guid.html</guid>
    
</item>

</channel>
</rss>