A number of customers have been reporting an increased number of connections to their EMC Centera devices.
I thought I'd investigate a little further, so I wrote a test app using the EMC Centera SDK that spins up 10 threads, all using the same poolref:
C:\tools>RetrieveContent.exe
Enter the IP address or DNS name of the cluster(s): 10.14.96.21,10.14.96.22,10.14.96.23,10.14.96.24
[10.14.96.21] Opened Pool, using poolref: 348777130463956
Press ENTER to continue.
Opened Clip:BLGE3TDLPU5FPe2VCP2NS8BOHV3G4185NGU78J0H2I8N0JK09FF2S
Opened Clip:16995546G1S78e0J6C74GH4F9D5G4185NH3TM609CDUCRNH56LDFP
Opened Clip:9CJLJ21FFI1G5eA9HDEQ07JENT5G4185NML7760DB1V1KQ5OMCKL9
Opened Clip:CUVD1C17HQHI7e1LF2HJITQT4DEG418FA7MIAA0R4GRLR7MQEEFK8
Opened Clip:0SDHQGQ2C5U26eAQ9G4R6JURLJRG418FENIELL0R5AFE52NKMJN3Q
Opened Clip:BP4CJJ66S6794e34PHJ2GFURC0AG418FENKNNE0P0ALMKSKHE08J6
Opened Clip:47BSSCJKJ0VLFeFVL4RP4TN49U0G418FEOEQLQ0R1V5RIBSIFO8TA
Opened Clip:4PGHG674NIEK0e9298KBEPIM86HG418FEOH3NH0G4EBQIKSDN8U0C
Opened Clip:8A8LF6D0C32LGeFSNHNLIR2T76LG418FEOL7GP0LF2UK5PAJ4QLPB
Opened Clip:FN5IOVTVGEPJ1eECMM7VU7ND3OKG418FEOT0LS0PDDHGFC9JB0IH6
and pausing....
Press ENTER to continue
I have 4 Access Nodes/Roles (AN), and the EMC SDK looks like it opens one connection per AN:
I noticed I had quite a few StorageFileWatch connections, so I ran a Dtrace with a filter of poolref:
25,659 14:53:56.843 [5,692] (StorageFileWatch) <7912> EV:M CPools::Open (Increment) -- Connection string: 10.14.96.21, PoolRef: 342777130463912, Usage count: 10
25,670 14:53:56.865 [5,692] (StorageFileWatch) <7912> EV:M CPools::Open (Increment) -- Connection string: 10.14.97.61,10.14.97.62,10.14.97.63,10.14.97.60, PoolRef: 350233193691504, Us
26,009 14:53:56.911 [5,692] (StorageFileWatch) <7492> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,027 14:53:56.935 [5,692] (StorageFileWatch) <7336> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,059 14:53:56.951 [5,692] (StorageFileWatch) <5732> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,080 14:53:56.976 [5,692] (StorageFileWatch) <7904> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,098 14:53:57.011 [5,692] (StorageFileWatch) <7744> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,116 14:53:57.035 [5,692] (StorageFileWatch) <5424> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,288 14:53:57.067 [5,692] (StorageFileWatch) <7912> EV:M CPools::Close (Decrement) -- Connection string: 10.14.96.21, PoolRef: 342777130463912, Usage count: 9
26,293 14:53:57.082 [5,692] (StorageFileWatch) <7912> EV:M CPools::Close -- Enable normal timeout -- Connection string: 10.14.97.61,10.14.97.62,10.14.97.63,10.14.97.60, PoolRef: 35023
26,466 14:54:56.868 [5,692] (StorageFileWatch) <5632> EV:M CPools::Open (Increment) -- Connection string: 10.14.96.21, PoolRef: 342777130463912, Usage count: 10
26,477 14:54:56.889 [5,692] (StorageFileWatch) <5632> EV:M CPools::Open (Increment) -- Connection string: 10.14.97.61,10.14.97.62,10.14.97.63,10.14.97.60, PoolRef: 350233193691504, Us
26,809 14:54:56.925 [5,692] (StorageFileWatch) <7492> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,827 14:54:56.949 [5,692] (StorageFileWatch) <7264> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,845 14:54:56.974 [5,692] (StorageFileWatch) <5168> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,884 14:54:57.000 [5,692] (StorageFileWatch) <4680> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,902 14:54:57.023 [5,692] (StorageFileWatch) <7336> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
26,926 14:54:57.047 [5,692] (StorageFileWatch) <6136> EV:M CVaultStoreEMCCentera::PoolOpen - Using existing PoolRef: 342777130463912
27,095 14:54:57.082 [5,692] (StorageFileWatch) <5632> EV:M CPools::Close (Decrement) -- Connection string: 10.14.96.21, PoolRef: 342777130463912, Usage count: 9
27,100 14:54:57.096 [5,692] (StorageFileWatch) <5632> EV:M CPools::Close -- Enable normal timeout -- Connection string: 10.14.97.61,10.14.97.62,10.14.97.63,10.14.97.60, PoolRef: 35023
I see two PoolRefs, as per my connection list:
select PartitionName,IPAddressList from PartitionEntry
ggg Ptn4 10.14.96.21,10.14.97.61
ggg Ptn9 10.14.96.21
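If the trace is long, counting the distinct PoolRefs by eye gets tedious. Here is a minimal Python sketch that pulls the connection string and PoolRef out of CPools::Open/Close lines like the ones above. The function name and the regular expression are my own and simply assume the line layout shown in this trace; lines cut off mid-number are skipped rather than guessed at.

```python
import re
from collections import defaultdict

# Assumes the "Connection string: <addresses>, PoolRef: <digits>," layout seen
# in the trace above. The trailing comma is required so that lines truncated
# in the middle of the PoolRef number are ignored instead of miscounted.
POOL_LINE = re.compile(r"Connection string: (?P<conn>[\d.,]+), PoolRef: (?P<ref>\d+),")


def poolrefs_by_connection(lines):
    """Map each connection string to the set of PoolRef values seen with it."""
    found = defaultdict(set)
    for line in lines:
        match = POOL_LINE.search(line)
        if match:
            found[match.group("conn")].add(match.group("ref"))
    return dict(found)


# Three lines copied from the trace above (the last one is truncated).
sample = [
    "25,659 14:53:56.843 [5,692] (StorageFileWatch) <7912> EV:M CPools::Open (Increment) -- Connection string: 10.14.96.21, PoolRef: 342777130463912, Usage count: 10",
    "25,670 14:53:56.865 [5,692] (StorageFileWatch) <7912> EV:M CPools::Open (Increment) -- Connection string: 10.14.97.61,10.14.97.62,10.14.97.63,10.14.97.60, PoolRef: 350233193691504, Us",
    "26,293 14:53:57.082 [5,692] (StorageFileWatch) <7912> EV:M CPools::Close -- Enable normal timeout -- Connection string: 10.14.97.61,10.14.97.62,10.14.97.63,10.14.97.60, PoolRef: 35023",
]

result = poolrefs_by_connection(sample)
for conn, refs in result.items():
    print(conn, "->", sorted(refs))
```

One PoolRef per connection string is what I would expect to see for a single process; more than one for the same string would be worth a closer look.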
In summary, an EV server can have multiple processes (StorageCrawler, StorageFileWatch, StorageArchive, MigratorServer, etc.) that connect to a Centera. If all connections within a single process use the same connection string, we open one PoolRef, so multiple threads within that process share the same PoolRef. PoolRef sharing can only occur within a single process.
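The sharing behaviour can be pictured as a per-process, reference-counted registry keyed by connection string. The Python sketch below is only an illustration of that idea, not EV's actual code: the class name, method names and the fake open function are all hypothetical, and closing the pool when the count reaches zero is a simplification (the trace suggests EV handles this with a timeout).

```python
import threading


class PoolRegistry:
    """Hypothetical per-process registry: one shared pool handle per connection string."""

    def __init__(self, open_pool, close_pool):
        self._open_pool = open_pool    # callable(connection_string) -> handle
        self._close_pool = close_pool  # callable(handle)
        self._lock = threading.Lock()
        self._pools = {}               # connection string -> [handle, usage count]

    def open(self, connection_string):
        """Return the shared handle, opening it only on first use (Increment)."""
        with self._lock:
            entry = self._pools.get(connection_string)
            if entry is None:
                entry = [self._open_pool(connection_string), 0]
                self._pools[connection_string] = entry
            entry[1] += 1
            return entry[0]

    def close(self, connection_string):
        """Drop one usage (Decrement); release the handle at zero (simplification)."""
        with self._lock:
            entry = self._pools[connection_string]
            entry[1] -= 1
            if entry[1] == 0:
                self._close_pool(entry[0])
                del self._pools[connection_string]

    def usage_count(self, connection_string):
        with self._lock:
            entry = self._pools.get(connection_string)
            return entry[1] if entry else 0


opened = []  # records every real "pool open" the fake SDK performs


def fake_open(connection_string):
    # Stand-in for a real pool open call; returns a made-up handle.
    opened.append(connection_string)
    return len(opened)


registry = PoolRegistry(fake_open, lambda handle: None)
handles = []


def worker():
    handles.append(registry.open("10.14.96.21"))


threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("real opens:", len(opened), "usage count:", registry.usage_count("10.14.96.21"))
```

Ten threads end up with one underlying open and a usage count of 10, which mirrors the Increment and Decrement lines in the trace. A second process would have its own registry, so it cannot share the handle.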
Under the covers, the EMC SDK may share connections, as we use FP_LAZY_OPEN for FPPool_Open(), which opens connections to addresses only as needed (as per the output below).
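To make "only as needed" concrete, here is a small Python sketch of the lazy-open idea. It is not the Centera SDK and makes no claim about how the SDK implements FP_LAZY_OPEN; the class, its methods and the fake connect function are hypothetical and only show a pool that knows about several addresses but connects to each one on first use.

```python
class LazyPool:
    """Hypothetical pool that connects to a node only when it is first used."""

    def __init__(self, addresses, connect):
        self._addresses = list(addresses)
        self._connect = connect   # callable(address) -> connection object
        self._connections = {}    # address -> connection, filled on demand

    def connection_for(self, address):
        """Return the connection for an address, opening it on first request."""
        if address not in self._addresses:
            raise ValueError(f"{address} is not part of this pool")
        if address not in self._connections:
            self._connections[address] = self._connect(address)
        return self._connections[address]

    def open_addresses(self):
        """Addresses that currently have an open connection."""
        return sorted(self._connections)


connect_calls = []


def fake_connect(address):
    # Stand-in for a real socket connect; just records the call.
    connect_calls.append(address)
    return f"connection to {address}"


pool = LazyPool(
    ["10.14.96.21", "10.14.96.22", "10.14.96.23", "10.14.96.24"],
    fake_connect,
)
pool.connection_for("10.14.96.21")
pool.connection_for("10.14.96.24")
pool.connection_for("10.14.96.24")  # reuses the existing connection

print(pool.open_addresses())
```

With four addresses in the pool and only two of them touched, only two connections exist, which is why the connection count you see depends on how much work each process is actually doing.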
[StorageFileWatch.exe]
TCP 192.168.75.122:63895 10.14.96.22:3218 ESTABLISHED 7880
[RetrieveContent.exe]
TCP 192.168.75.122:63896 10.14.96.24:3218 ESTABLISHED 7880
[RetrieveContent.exe]
TCP 192.168.75.122:63897 10.14.96.24:3218 ESTABLISHED 4544
[RetrieveContent.exe]
TCP 192.168.75.122:63898 10.14.96.23:3218 ESTABLISHED 4544
[RetrieveContent.exe]
TCP 192.168.75.122:63899 10.14.96.21:3218 ESTABLISHED 4544
[RetrieveContent.exe]
C:\tools>tasklist|find /i "retri"
RetrieveContent.exe 7880 Console 1 8,360 K
RetrieveContent.exe 4544 Console 1 8,540 K
In EV10, a number (1+) of StorageCrawler processes (default of 10) will be maintained on each Storage server to handle indexing requests. Processes are only launched as needed, so it is possible that none will be running if no work is being requested from that particular Storage server. This model boosts StorageCrawler's ability to cope with 64-bit demand from multiple Indexing servers and reduces the existing single point of failure.
So if you are seeing a high number of connections, use netstat -anob (look for the default destination port, 3218) to determine which processes have open connections, and throttle if need be.
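If there is a lot of netstat output to wade through, a short script can do the counting. The Python sketch below tallies ESTABLISHED TCP connections to port 3218 per process and PID. The function name is my own, and it assumes the usual netstat -anob layout in which the [process.exe] line follows the connection row it belongs to; rows with no owner line are counted as "unknown". The sample text reuses the addresses from the output earlier in this post.

```python
import re
from collections import Counter

# Any connection row starts with the protocol name.
ROW = re.compile(r"^\s*(?:TCP|UDP)\s")
# An established TCP row: local address, remote address:port, state, PID.
CONN = re.compile(r"^\s*TCP\s+\S+\s+\S+:(?P<port>\d+)\s+ESTABLISHED\s+(?P<pid>\d+)\s*$")
# The owning executable, printed by the -b switch as "[name.exe]".
OWNER = re.compile(r"^\s*\[(?P<name>.+)\]\s*$")


def count_centera_connections(netstat_text, port=3218):
    """Count established connections to the given remote port per (process, PID)."""
    counts = Counter()
    pending_pid = None  # PID of a matching row still waiting for its owner line
    for line in netstat_text.splitlines():
        if ROW.match(line):
            if pending_pid is not None:
                counts[("unknown", pending_pid)] += 1
            conn = CONN.match(line)
            if conn and int(conn.group("port")) == port:
                pending_pid = int(conn.group("pid"))
            else:
                pending_pid = None
            continue
        owner = OWNER.match(line)
        if owner and pending_pid is not None:
            counts[(owner.group("name"), pending_pid)] += 1
            pending_pid = None
    if pending_pid is not None:
        counts[("unknown", pending_pid)] += 1
    return counts


sample = """\
  TCP    192.168.75.122:63895   10.14.96.22:3218   ESTABLISHED   7880
 [RetrieveContent.exe]
  TCP    192.168.75.122:63896   10.14.96.24:3218   ESTABLISHED   7880
 [RetrieveContent.exe]
  TCP    192.168.75.122:63897   10.14.96.24:3218   ESTABLISHED   4544
 [RetrieveContent.exe]
  TCP    192.168.75.122:63898   10.14.96.23:3218   ESTABLISHED   4544
 [RetrieveContent.exe]
  TCP    192.168.75.122:63899   10.14.96.21:3218   ESTABLISHED   4544
 [RetrieveContent.exe]
"""

counts = count_centera_connections(sample)
for (name, pid), total in sorted(counts.items()):
    print(f"{name} (PID {pid}): {total} connection(s) to port 3218")
```

To run it against a live server, save the output of netstat -anob from an elevated prompt to a file and pass the file contents to count_centera_connections().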