Was AppFabric error reporting developed by an Intern?

 posted on 05:24:36 pm by Frank in Dev

There are a few things in life that can make me really mad.  One of them is bad code and bad coding practices.  Today, I am in the final stage of implementing a global cache eviction strategy for our RIA architecture using Microsoft Windows AppFabric, and I keep hitting a really weird error that has bugged me for quite some time.  Here is the error:

AppFabric cache server generated an error: ErrorCode<ERRCA0017>:SubStatus<ES0001>:There is a temporary failure. Please retry later.

My first observation is that this error message is really thin.  It sounds to me like someone at the end of his internship at Microsoft had to develop the error handling and cut it short because his internship was over...  Quite frankly, this is the best example of what you should NOT do when reporting errors back to clients.  I won't bore you with the details here; if someone ever reads this blog post and is interested in discussing why, I'll be glad to explain.  I am sure it is clear to any seasoned developer why this message sucks.

Just so you know, I get this error when calling the Get method on a DataCache that exists and works for different regions in my AppFabric wrapper:

 

var cache = this.cacheFactory.GetCache(CacheCategoryType.Revisions.ToString());

cacheRevisions = (Dictionary<CacheRevisionItemKeyValuePair, CacheRevisionItem>)cache.Get(ServerCacheSystemItemKeys.CacheRevisions.FullNameToString(), cacheNamespace);

 

In fairness, I'll tell you what I thought the problem was before I explain the 2 lines of code.  The problem seemed simple on the surface: I was calling cache.Get on a non-existent region.

With that said, the code above just gets a cache where we store all of our custom revisions for all of our SQL Server databases.  In that cache, we have one region per SQL database so we can segregate stuff; we host several hundred SQL databases.  The problem is, sometimes the region might not exist because nobody has connected to the specific database since the last time the cache was cleared.

So what do we do next?  Of course, we simply open .NET Reflector in order to determine what the true problem is.  By looking at the Get method, we see right away that this code is weird:

private object InternalGet(string key, out DataCacheItemVersion version, string region)
{
    if ((region != null) && RegionNameProvider.IsSystemRegion(region))
    {
        throw new ArgumentException(GlobalResourceLoader.GetString(CultureInfo.CurrentUICulture, "NotPermittedForDefaultRegion"), "region");
    }
    version = null;
    RequestBody reqMsg = new RequestBody(ReqType.GET);
    reqMsg.RegionName = region;
    reqMsg.CacheName = this._myName;
    reqMsg.Key = new Key(key);
    reqMsg.Version = InternalCacheItemVersion.Null;
    object valObject = null;
    ResponseBody respBody = this.SendReceive(reqMsg);
    if (respBody.Ack == AckNack.Ack)
    {
        valObject = respBody.ValObject;
        version = new DataCacheItemVersion(respBody.Version);
        return valObject;
    }
    if ((respBody.ResponseCode != ErrStatus.KEY_DOES_NOT_EXIST) && (respBody.ResponseCode != ErrStatus.REGION_DOES_NOT_EXIST))
    {
        ThrowException(respBody);
    }
    return valObject;
}

 

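Say what?  If the ResponseCode is anything other than KEY_DOES_NOT_EXIST or REGION_DOES_NOT_EXIST, the error is thrown back to the caller; a missing key or a missing region simply returns null.  So much for my non-existent region theory: that case cannot be what raises the exception.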

 

Let's look at the less-than-capable ThrowException method to find out why we're getting error code 17 (0x11 in the code below) with a substatus of 1, and what it could really mean:

 

private static void ThrowException(ResponseBody respBody)
{
    int num;
    int substatus = -1;
    switch (respBody.ResponseCode)
    {
        case ErrStatus.INTERNAL_ERROR:
        case ErrStatus.REPLICATION_FAILED:
        case ErrStatus.REGIONID_NOT_FOUND:
            num = 0x11;
            break;

        case ErrStatus.INVALID_REGION:
        case ErrStatus.INVALID_CACHE:
            num = 3;
            break;

        case ErrStatus.NO_WRITE_QUORUM:
            substatus = 2;
            num = 0x11;
            break;

        case ErrStatus.REGION_ALREADY_EXISTS:
            num = 7;
            break;

        case ErrStatus.TIMEOUT:
            num = 0x12;
            break;

        case ErrStatus.REGION_DOES_NOT_EXIST:
            num = 5;
            break;

        case ErrStatus.VERSION_MISMATCH:
            num = 1;
            break;

        case ErrStatus.KEY_ALREADY_EXISTS:
            num = 8;
            break;

        case ErrStatus.KEY_DOES_NOT_EXIST:
            num = 6;
            break;

        case ErrStatus.NAMED_CACHE_DOES_NOT_EXIST:
            num = 9;
            break;

        case ErrStatus.MAX_NAMED_CACHE_COUNT_EXCEEDED:
            num = 10;
            break;

        case ErrStatus.OBJECT_LOCKED:
            num = 11;
            break;

        case ErrStatus.OBJECT_NOT_LOCKED:
            num = 12;
            break;

        case ErrStatus.INVALID_LOCK:
            num = 13;
            break;

        case ErrStatus.INVALID_ENUMERATOR:
            num = 14;
            break;

        case ErrStatus.OUT_OF_MEMORY:
            num = 0x16;
            break;

        case ErrStatus.SERVER_DEAD:
            substatus = 5;
            num = 0x11;
            break;

        case ErrStatus.REPLICATION_QUEUE_FULL:
            substatus = 3;
            num = 0x11;
            break;

        case ErrStatus.KEY_LATCHED:
            substatus = 4;
            num = 0x11;
            break;

        case ErrStatus.CLIENT_SERVER_VERSION_MISMATCH:
            num = 0x13;
            break;

        case ErrStatus.NOT_PRIMARY:
            substatus = 1;
            num = 0x11;
            break;

        case ErrStatus.CONNECTION_TERMINATED:
            num = 0x10;
            break;

        case ErrStatus.THROTTLED:
            substatus = 6;
            num = 0x11;
            break;

        default:
            num = 4;
            break;
    }
    throw NewException(num, substatus);
}

 

The answer seems to lie here:

 

case ErrStatus.NOT_PRIMARY:
    substatus = 1;
    num = 0x11;
    break;

 

You're kidding, right?  Status 17 with a substatus of 1 means NOT_PRIMARY.  I guess we're not really much further along than we were a few minutes ago...
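For anyone hitting the same wall, the mapping buried in ThrowException can be turned into a small lookup.  This is only a sketch and not part of AppFabric: the names AppFabricErrorDecoder and Describe are made up for illustration, and the substatus table is copied from the decompiled switch above, so it only covers error code 17 as it appears in this version of the client library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not an AppFabric type: decodes the ErrorCode/SubStatus
// pair embedded in the exception message.
public static class AppFabricErrorDecoder
{
    // Substatus values assigned together with num = 0x11 (17) in the
    // decompiled ThrowException switch shown above.
    private static readonly Dictionary<int, string> Code17SubStatus =
        new Dictionary<int, string>
        {
            { 1, "NOT_PRIMARY" },
            { 2, "NO_WRITE_QUORUM" },
            { 3, "REPLICATION_QUEUE_FULL" },
            { 4, "KEY_LATCHED" },
            { 5, "SERVER_DEAD" },
            { 6, "THROTTLED" },
        };

    // Matches the format seen in the error at the top of this post.
    private static readonly Regex Pattern =
        new Regex(@"ErrorCode<ERRCA(\d+)>:SubStatus<ES(\d+)>");

    public static string Describe(string message)
    {
        Match m = Pattern.Match(message ?? string.Empty);
        if (!m.Success)
        {
            return "unrecognized message format";
        }

        int code = int.Parse(m.Groups[1].Value);
        int subStatus = int.Parse(m.Groups[2].Value);

        string name;
        if (code == 17 && Code17SubStatus.TryGetValue(subStatus, out name))
        {
            return "code 17 / substatus " + subStatus + " = " + name;
        }
        return "code " + code + " / substatus " + subStatus + " (no mapping in this sketch)";
    }
}
```

Feeding it the message from the top of this post gives back NOT_PRIMARY, which at least saves a trip through Reflector, even if it still doesn't tell you what actually went wrong.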

 

OK, this is where you start looking for log files.  Fortunately, the answer lies here: http://msdn.microsoft.com/en-us/library/ff921010.aspx.  In a nutshell, you access the log by following these simple steps:

- Open the Event Viewer on a cache host (the MSDN page above links to instructions for launching it).

- In the left navigation pane, expand the Applications and Services Logs folder.

- Then expand Microsoft, Windows, and Application Server-System Services.

- Select the Admin log.
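If you would rather not click through the Event Viewer on every cache host, the same log can be read from code with the standard System.Diagnostics.Eventing.Reader classes.  What follows is a rough sketch under a few assumptions: the CacheHostLog class and its methods are my own invention, the channel name is inferred from the navigation path above and should be checked on your own host, and it has to run on Windows under an account that is allowed to read that log.

```csharp
using System;
using System.Diagnostics.Eventing.Reader;
using System.Linq;

// Hypothetical helper: dumps recent entries from the AppFabric Admin log.
public static class CacheHostLog
{
    // ASSUMPTION: channel name inferred from the Event Viewer path above;
    // confirm it on your cache host before relying on it.
    public const string AdminLog =
        "Microsoft-Windows-Application Server-System Services/Admin";

    // Builds an XPath filter such as "*[System[(Level=1 or Level=2)]]".
    // At least one level must be supplied; an empty list is not handled.
    public static string BuildLevelQuery(params int[] levels)
    {
        return "*[System[(" +
               string.Join(" or ", levels.Select(l => "Level=" + l)) +
               ")]]";
    }

    public static void PrintRecent(int max)
    {
        // Levels 1 to 3 are Critical, Error and Warning.
        var query = new EventLogQuery(AdminLog, PathType.LogName, BuildLevelQuery(1, 2, 3))
        {
            ReverseDirection = true // newest entries first
        };

        using (var reader = new EventLogReader(query))
        {
            EventRecord record;
            int count = 0;
            while (count < max && (record = reader.ReadEvent()) != null)
            {
                using (record)
                {
                    Console.WriteLine("{0} [{1}] {2}",
                        record.TimeCreated,
                        record.LevelDisplayName,
                        record.FormatDescription());
                }
                count++;
            }
        }
    }
}
```

Calling CacheHostLog.PrintRecent(20) on a cache host should list the latest warnings and errors, newest first.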

 

In there, I simply found out that my user didn't have access to the goddamn server: "Server channel security authorization failed for client {domain\user}. To grant access, use Grant-CacheAllowedClientAccount command".  All things not being created equal, we can see that the logging feature of AppFabric is quite useful compared to the error reporting junk we looked at above.

 

 

1 comment

Comment from: david lacerte [Visitor]

God… that pretty much sucks, I admit.

10/31/11 @ 11:22
Francois Germain is going to write his thoughts on software architecture on the Microsoft platform
