How to Retrieve EMC Centera Cluster/Pool Capabilities

19 Oct 2007
This article shows you how to connect to a Centera Storage appliance and get the Centera Cluster Capabilities.
Screenshot - image013.png

How to Write to Centera Storage Appliances: Introduction

This article is part of a series of articles that I am writing to illustrate the use of the EMC Centera SDK and the .NET wrapper being developed as an open source project to store "fixed content" on the EMC Centera storage appliance. Before I start, I'd like to explain what "fixed content" is and give an overview of the reasoning behind the emergence of this type of storage.

Fixed Content Definition

Fixed content is information that never changes after creation. It is actively referenced, typically shared among users, and must be retained, often for a mandatory period of time. Examples include electronic documents, presentations and e-books; rich media such as movies, videos, digital photographs and audio files; check images and financial statements; bioinformatics data, X-rays, MRIs and CAT scans; CAD/CAM diagrams and blueprints; and e-mail messages.

Examples of "Fixed Content"

  • An average enterprise (a 250-person organization) generates approximately 1.5 TB of e-mails per year.
  • A picture archive in a large hospital may generate more than 5 TB per year in digital X-rays or MRIs.
  • Banks are scanning millions of check images per year, requiring multiple terabytes of storage.

State of the Industry

A large portion of all digital information is fixed content. It is estimated that fixed content will be the largest portion of digital content created by the human race in the next century, exceeding all dynamic content put together.

Screenshot - image001.jpg

The information life cycle also drives towards more fixed content. Enterprises embracing e-mail and electronic documents are increasing the need for fixed content storage exponentially. Finally, emerging regulations in the financial and healthcare industries that require retention (maintaining a copy of fixed content for a mandatory period of time) are creating a huge need for fixed content storage and fixed content solutions.

The EMC Centera appliance is one of the appliances available on the market today to satisfy that need. Other companies, like NetApp, have solutions equivalent to Centera's, but this series of articles is specific to showing how to code using the Centera SDK.

What You Will Need to be Able to Develop Against the Appliance

  1. To start writing content to the Centera appliance, you will need the Centera SDK, which you can download after registering on the EMC site. Several versions of the SDK are available; use the 3.1SP1 version. Note that the only way to save content on most "fixed content" storage devices is through the proprietary API(s) that the device manufacturer publishes. Some manufacturers do offer open-standard interfaces (CIFS, NFS, HTTP and WebDAV) to read from and write to their devices. Usually, however, you end up losing much of the device's power: features like WORM (write-once-read-many) functionality or retention capabilities are usually lost with the open standards.

  2. You will also need the .NET wrapper for the Centera SDK. The latest version of the open source .NET project is hosted on SourceForge.

  3. You need to have access to the "Public Centera" appliances. EMC recognized that the Centera device is not available everywhere and so set up an appliance on the internet that developers can develop against. The content of this appliance is purged periodically by EMC. The latest IP addresses can be found on the EMC site. As of this writing, the valid IP addresses are:

    • EMEA1 - 152.62.65.11, 152.62.65.12, 152.62.65.13, 152.62.65.14
    • EMEA2 - 152.62.65.16, 152.62.65.17, 152.62.65.18, 152.62.65.19
    • EMEA3 - 212.3.248.41, 212.3.248.42, 212.3.248.43, 212.3.248.44
    • EMEA4 - 212.3.248.46, 212.3.248.47
    • EMEA5 - 152.62.65.21, 152.62.65.22
    • US1 - 128.221.200.56, 128.221.200.57, 128.221.200.58, 128.221.200.59
    • US2 - 128.221.200.60, 128.221.200.61, 128.221.200.62, 128.221.200.63
    • US3 - 128.221.200.64, 128.221.200.65, 128.221.200.66, 128.221.200.67
    • US4 - 128.221.200.116, 128.221.200.117, 128.221.200.118, 128.221.200.119
    • US5 - 128.221.200.120, 128.221.200.121, 128.221.200.122, 128.221.200.123

Special Architecture Knowledge You Need

  1. The Centera appliance stores content, and each piece of content is stored and retrieved using an address. This content/address combination is called CAS, or "content addressable storage," a term you will encounter often in the industry.

  2. Even the smallest block of data to be stored must be housed inside a memory structure that the SDK calls a "C-Clip." In other words, you first create a C-Clip and place your content inside it, and then send the C-Clip to the Centera to be saved. The C-Clip itself is made of two components: the Content Descriptor File (CDF for short) and the BLOB.

  3. The Content Descriptor File is an XML file that holds metadata. The CDF contains TAGS and ATTRIBUTES.

    Tag

    • An XML tag in the CDF
    • A user-defined name

    Example: <Application_Name>ImageStore2004</Application_Name>

    Attribute

    • An XML attribute in the CDF
    • A user-defined value

    Example: <My_App name="ImageStoreServer"/>

  4. The C-Clip also holds a BLOB. The BLOB is usually the content you want to store. BLOBs have the following characteristics:

    • They hold objects stored on Centera
    • They are represented as distinct bit sequences of the objects you are trying to store
    Screenshot - image002.jpg
  5. Centera runs an OS called "CentraStar." This OS is optimized for writing and reading the C-Clip objects.

  6. Centera objects have metadata. The applications you develop create metadata associated with one or more objects. These objects are stored independently of volume/directory information, as in the image below:

    Screenshot - image003.jpg
  7. Overall process overview:

    Screenshot - image004.jpg
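
The content/address idea from item 1 above can be illustrated with a small in-memory sketch. It is only a toy under stated assumptions: the class name ToyCas is hypothetical, it does not use the Centera SDK, and deriving the address from a SHA-256 digest of the content is an assumption made for illustration, not a description of how Centera computes its addresses.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Toy in-memory content-addressable store (hypothetical illustration only).
// It is not the Centera SDK and does not reproduce Centera's addressing scheme.
public static class ToyCas
{
    private static readonly Dictionary<string, byte[]> Store =
        new Dictionary<string, byte[]>();

    // Derives the address from the content itself.
    // SHA-256 is an assumption chosen for this sketch.
    public static string AddressOf(byte[] content)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(content);
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // Stores a copy of the content and returns the address needed to read it back.
    public static string Write(byte[] content)
    {
        string address = AddressOf(content);
        Store[address] = (byte[])content.Clone();
        return address;
    }

    // Returns the content stored under the given address.
    public static byte[] Read(string address)
    {
        return Store[address];
    }
}
```

Because the address depends only on the content, writing the same bytes twice yields the same address, and the caller never deals with a volume or directory path. The application only has to keep the address returned by Write.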

Centera: Three Modes

Basic Mode

Centera acts like standard magnetic storage. An object marked for deletion is deleted immediately.

Compliance Mode

Active retention protection ensures availability of objects for a configurable period of time. An object marked for deletion is not deleted until the retention period passes.

Compliance Plus Mode

Similar to compliance mode, compliance plus mode uses retention periods. The default retention period is infinite. Unlike in compliance mode, data is never purged.

Benefits of the Compliance Modes

  • Retention enforcement
    1. Retention is set on the clip; applies to all BLOBs that are referenced by the clip
    2. Cannot delete a clip/BLOB when retention has not expired
    3. Once retention expires, clip is eligible for deletion
  • Data deletion enhancements: shredding
    1. Overwrites data multiple times with a random bit pattern
Screenshot - image005.jpg
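
The retention rules described above can be summarized in a few lines of code. This is a simplified sketch and not SDK code: the enum CenteraMode, the class RetentionSketch and the method CanDelete are hypothetical names, and a null retention period is used here to model the infinite default of compliance plus mode.

```csharp
using System;

// Hypothetical names for illustration; none of these types come from the SDK.
public enum CenteraMode { Basic, Compliance, CompliancePlus }

public static class RetentionSketch
{
    // Returns true when a clip may be deleted at 'nowUtc'.
    // A null 'retention' models an infinite retention period.
    public static bool CanDelete(CenteraMode mode, DateTime createdUtc,
                                 TimeSpan? retention, DateTime nowUtc)
    {
        if (mode == CenteraMode.Basic)
            return true;   // basic mode: deletion takes effect immediately

        if (retention == null)
            return false;  // infinite retention never expires

        // Compliance modes: eligible only once the retention period has passed.
        return nowUtc >= createdUtc + retention.Value;
    }
}
```

With a retention period of TimeSpan.Zero the clip is eligible for deletion immediately, even in a compliance mode.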

Centera SDK

The Centera-supplied Software Development Kit (SDK) contains:

  • C callable libraries
  • Java interface that utilizes JNI
  • Documentation
  • Sample code

The SDK can be downloaded from the EMC site. You will need to create an account with EMC to be able to download the full SDK.

Why Is It Needed?

  • Provides content-addressing framework
  • No file system and associated drawbacks
  • Applications access the Centera via API calls only

Centera Cluster

A cluster is a logical CAS archive that appears to your application as a single unit. A cluster can be accessed by one or more applications via a set of node IP addresses and access profiles.

Screenshot - image006.jpg

Pool

A pool is an SDK object that represents one or more clusters. Your application must OPEN a pool by providing a series of node IP addresses and access profile credentials for the desired set of clusters. The first accessible IP address in the list represents the primary cluster, while subsequent IP addresses are considered the secondary clusters (assuming that they represent distinct clusters). The pool object also auto-discovers any replica clusters that are configured via the primary or secondary clusters.

Profiles

The system administrator creates access profiles for applications. Profiles are a means to enforce authentication and authorization. The system administrator can determine which applications have access to a cluster and what operations they can perform. An application can only log into Centera if a profile for that application has been created on the Centera cluster and the credentials for that profile have been made available to the application server.

Once the profiles have been created on the Centera cluster, the system administrator exports the profile information to a Pool Entry Authorization (PEA) file and copies this file to the application server. The system administrator can set an environment variable that points to the PEA file or can leave it to the application to give the path to this file.

When you code your application, you can leave the PEA file out of your code, in which case the environment variable set by the system administrator points the SDK to the PEA file to use. Alternatively, your enterprise may have created specific PEA files and distributed them to the development team. In that case, you can give the full path of the PEA file in your code when opening the pool. For these articles, the publicly available PEA profiles will be used. The files have the following naming convention: ClusterName_ProfileName_CapabilitiesList.pea. For example, us2_armTest2_rdqeDcwh.pea translates to:

  • Application Profile belongs to Centera Cluster US2
  • Profile Test2, Advanced Retention Management (arm) enabled
  • Capabilities: all enabled – please refer to the list below

Capabilities Definitions

  • r: read
  • w: write
  • d: delete
  • q: query
  • e: exists
  • D: privileged delete
  • c: clip copy
  • h: retention hold

All profiles except "Profile1" are configured to enable the "monitor" capability. Each profile also comes enabled with a name/secret combination that corresponds to the profile name. Thus, to access a profile defined by the us2_armTest2_rdqeDcwh.pea file, the application could alternatively use "name=armTest2,secret=armTest2" in the connect string.
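
The naming convention and the capability letters above are easy to decode in code. The following is a minimal sketch, not part of the SDK or the wrapper: the types PeaFileInfo and PeaFileName are hypothetical helpers that only interpret the file name and never open the PEA file itself.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper types for illustration; not part of the Centera SDK.
public sealed class PeaFileInfo
{
    public string Cluster;
    public string Profile;
    public List<string> Capabilities = new List<string>();
}

public static class PeaFileName
{
    // Capability letters as listed in the article.
    private static readonly Dictionary<char, string> Names =
        new Dictionary<char, string>
        {
            { 'r', "read" }, { 'w', "write" }, { 'd', "delete" },
            { 'q', "query" }, { 'e', "exists" }, { 'D', "privileged delete" },
            { 'c', "clip copy" }, { 'h', "retention hold" }
        };

    // Splits ClusterName_ProfileName_CapabilitiesList.pea into its three parts.
    public static PeaFileInfo Parse(string fileName)
    {
        string stem = Path.GetFileNameWithoutExtension(fileName);
        string[] parts = stem.Split('_');
        if (parts.Length != 3)
            throw new FormatException("Expected ClusterName_ProfileName_CapabilitiesList.pea");

        PeaFileInfo info = new PeaFileInfo();
        info.Cluster = parts[0];
        info.Profile = parts[1];
        foreach (char c in parts[2])
        {
            string name;
            info.Capabilities.Add(Names.TryGetValue(c, out name) ? name : "unknown (" + c + ")");
        }
        return info;
    }

    // Builds the name/secret alternative, where the secret matches the profile name.
    public static string NameSecret(PeaFileInfo info)
    {
        return "name=" + info.Profile + ",secret=" + info.Profile;
    }
}
```

For us2_armTest2_rdqeDcwh.pea, Parse yields cluster "us2", profile "armTest2" and all eight capabilities, and NameSecret returns "name=armTest2,secret=armTest2".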

Conclusion

And as Forrest Gump said in the drama movie with the same name, "That's all I am going to say about that." This introduction should give you enough knowledge to be able to read the SDK and write code to use the Centera appliance. Since this article is one in a series of articles I am writing about different functionalities, each individual article will have this introduction and then will discuss the specific Centera functionality the article will address.

How to Set Up the Development Environment

  1. In Visual Studio, create a new project called "AdrdProjectCentera1" as in the figure below:

    Screenshot - image007.png

    Note: I am creating the project on my E: drive in the CAS directory. The project name in this article is "AdrdProjectCentera1." This will create the directory structure needed by Visual Studio. The directory of interest in this solution structure is the debug directory that gets created by Visual Studio. In this article, the full path of the directory of interest is as follows: E:\CAS\AdrdProjectCentera1\AdrdProjectCentera1\AdrdProjectCentera1\bin\Debug. Note that your path will be different depending on the location of your project.

  2. The next step is to unzip the EMC Centera SDK files. The SDK is delivered from the EMC site as a single zipped file. The default zip file name is 3[1].1_SDK_Windows_gcc.zip as of Oct 13, 2007. Once the file is unzipped, a number of directories will be created. Copy the files in the lib directory to the debug directory created by Visual Studio in step 1.

    The files that you will copy are FPLibrary.dll, fpos32.dll, fpparser.dll and pai_module.dll. There is also an FPLibrary.jar file that exists in that lib directory. You do not need to copy that file. The FPLibrary.jar file is the Java wrapper for FPLibrary.dll. This JAR is the equivalent of the .NET wrapper that the SourceForge project is all about. Also, all the LIB files are to be used if you are developing using C or C++. Just ignore these files for this article.

  3. Next, download all the PEA files to be able to develop against the "public Centera." I will use the "US X" PEA files available on the EMC website as of Oct 13th, 2007. Make sure you copy the PEA files to the debug directory described in step 1 above.

  4. The next step would be to unzip the .NET wrapper you downloaded from the SourceForge site. The default zipped file that you downloaded would be FPApi.NET.zip. Once it is fully unzipped, the following directories will be created:

    Screenshot - image008.png
  5. The ZIP file from SourceForge does not include the binary file of the wrapper (compiled version of the code). So, you will need to compile the code to generate the final wrapper that you will use in this article project. To do so, double click on Wapper.sln, which the ZIP file extraction created. This should start a new instance of Visual Studio and the solution should look as follows:

    Screenshot - image009.png
  6. Compile the solution by selecting the "Build" -->"Build Solution" menu options, as in the next figure:

    Screenshot - image010.png
  7. Once the build is complete, copy the files FPSDK.dll and FPSDK.pdb that are generated as a result of the solution build to the debug directory created in step 1.

  8. The final debug directory for the solution should look like this:

    Screenshot - image011.png
  9. The final step is to set a reference to FPSDK.DLL in your solution. To do so, open the original solution you created in step 1 (if it is not already open).

    Screenshot - image012.png

Finally, the Article Content: How to Retrieve the Centera Cluster Capabilities

The following screenshot shows this article's UI.

Screenshot - image013.png

To actually retrieve the cluster information, you need to make the following API calls:

  1. Open the Centera cluster by creating an instance of the wrapper FPPool object.
  2. Use the FPPool instance you created in the step above to retrieve the cluster capabilities.
  3. Close the FPPool instance.

Open the Pool

To open the Centera pool, you will need the cluster "Connection String." This is a single IP address for a single Centera, or a comma-separated list of IP addresses if Centera is configured as a cluster. Appended to the IP list are a "?" sign and the full path of the PEA file. In the code associated with this article, the PEA files are included in the debug directory.

Sample of the Connection String

128.221.200.56?us1_profile1_rwqe.pea

Or

128.221.200.56, 128.221.200.57, 128.221.200.58, 128.221.200.59?us1_profile1_rwqe.pea
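
Assembling the connection string by hand is error-prone, so a small helper can be useful. This is a sketch under stated assumptions: ConnectionStringSketch and Build are hypothetical names, not SDK calls. The helper joins the addresses with a plain comma and no space, which is a formatting choice of this sketch.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Centera SDK or wrapper.
public static class ConnectionStringSketch
{
    // Joins the node IPs with commas, then appends "?" and the PEA file path.
    // When no PEA path is given, only the IP list is returned.
    public static string Build(string[] nodeIps, string peaPath)
    {
        if (nodeIps == null || nodeIps.Length == 0)
            throw new ArgumentException("At least one node IP is required.");

        string ips = string.Join(",", nodeIps);
        return string.IsNullOrEmpty(peaPath) ? ips : ips + "?" + peaPath;
    }
}
```

For example, Build(new string[] { "128.221.200.56" }, "us1_profile1_rwqe.pea") returns the first sample string shown above.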

Retrieve the Cluster Capabilities

#region Build the string to display in the UI

// myPool is the open FPPool instance; SDKVersion is read from the FPPool type itself.
strPoolInfo = "\nPool Information" + "\n================" +
    "\nCluster ID:                   " + myPool.ClusterID +
    "\nCluster Time:                 " + myPool.ClusterTime +
    "\nCluster Name:                 " + myPool.ClusterName +
    "\nCentraStar software version:  " + myPool.CentraStarVersion +
    "\nSDK version:                  " + FPPool.SDKVersion +
    "\nCluster Capacity (Bytes):     " + myPool.Capacity +
    "\nCluster Free Space (Bytes):   " + myPool.FreeSpace +
    "\nCluster BlobNamingSchemes:    " + myPool.BlobNamingSchemes +
    "\nCluster CenteraEdition:       " + myPool.CenteraEdition +
    "\nCluster ClipBufferSize:       " + myPool.ClipBufferSize.ToString() +
    "\nCluster DeleteAllowed:        " + myPool.DeleteAllowed.ToString() +
    "\nCluster DeletionsLogged:      " + myPool.DeletionsLogged.ToString() +
    "\nCluster ExistsAllowed:        " + myPool.ExistsAllowed.ToString() +
    "\nCluster QueryAllowed:         " + myPool.QueryAllowed.ToString() +
    "\nCluster RetentionDefault:     " + myPool.RetentionDefault.ToString() +
    "\nCluster ReadAllowed:          " + myPool.ReadAllowed.ToString() +
    "\nCluster WriteAllowed:         " + myPool.WriteAllowed.ToString();

#endregion

Close the FPPool

In the included sample, I open the pool inside a using statement, so the FPPool is closed automatically when the block exits. Alternatively, you can close the pool explicitly with the following statement:

myPool.Close();
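
To show why the using statement makes the explicit Close() call unnecessary, here is a self-contained sketch with a stand-in class. DemoPool is hypothetical and is not the wrapper's FPPool. The sketch assumes that disposing a pool closes it, which is consistent with the sample's behavior described above.

```csharp
using System;

// Hypothetical stand-in for a pool-like resource; it is NOT the wrapper's FPPool.
public sealed class DemoPool : IDisposable
{
    public bool IsOpen { get; private set; }

    public DemoPool() { IsOpen = true; }

    public void Close() { IsOpen = false; }

    // The using statement calls Dispose when its block exits, even on an exception.
    public void Dispose() { Close(); }
}

public static class UsingSketch
{
    // Returns true if the pool was closed by the using statement alone.
    public static bool ClosedAfterUsing()
    {
        DemoPool pool = new DemoPool();
        using (pool)
        {
            // Work with the open pool here; no explicit Close() call is made.
        }
        return !pool.IsOpen;
    }
}
```

The using form is usually preferable because the resource is released even if an exception is thrown inside the block, whereas an explicit Close() call at the end of the method would be skipped.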

Explaining the Capabilities

  • ClusterID: Unique ID of the cluster
  • ClusterTime: Time on the cluster; note that all Centera clusters keep GMT time
  • ClusterName: The name given to the cluster; Centera administrators often leave this value unset
  • CentraStarVersion: The version of the OS running on Centera
  • SDKVersion: The version of the SDK your application is using; usually it is the version you downloaded from EMC, but note that newer versions of the SDK can talk to earlier versions of the CentraStar OS
  • Capacity: Total space on the Centera pool you are connecting to
  • FreeSpace: Total available space on the Centera pool you are connecting to
  • CenteraEdition: Either basic, CE or CE+; see the "Centera: Three Modes" section earlier in this article
  • DeleteAllowed: Whether deletion of clips is allowed on this pool
  • DeletionsLogged: Whether deletions are logged; usually this is set to true for auditing purposes, especially if the pool/cluster is in basic mode
  • RetentionDefault: The default retention period; most of the public Centera clusters have this value set to 00:00:00, which implies that there is no retention. In other words, C-Clips can be deleted immediately

For all other capabilities, please see the Centera API reference guide, Centera_SDK_3.1_API_Ref_Guide.pdf, and review the FPPool_GetCapability API. Also included in the demo code are two classes that are used to serialize the capabilities. The classes are named AdrdCenteraClusterInfoItem and AdrdCenteraRetentionInfoItem, respectively. These classes represent most of the capabilities that you will ever use when developing with Centera. I will use them in my next two articles on how to write to Centera and how to read from Centera.

Background

You can get the Microsoft or a PDF version of this article from here.

History

  • 19 October, 2007 -- Original version posted

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



About the Author

AdelArch
Systems Engineer
United States
Adel Eddin is a software architect evangelist with extensive experience in planning, developing, and implementing large information technology solutions in the financial and public sectors.

Last Updated 19 Oct 2007
Article Copyright 2007 by AdelArch