The issue in a nutshell:
A block blob can be created with a single PUT request. This will create a blob with committed content but the blob will not have any committed blocks!
This means that you cannot assume that the concatenation of committed blocks is the same as the committed content.
When working with block blobs you'll have to pay extra attention to blobs with empty block lists, because such blobs may or may not be empty!
The original question:
One of our storage blobs in an Azure account has an empty block list, although it is non-empty.
I'm retrieving the block list like this (C#):
foreach (var block in _cloudBlob.DownloadBlockList(
    BlockListingFilter.Committed,
    AccessCondition.GenerateLeaseCondition(_leaseId)))
{
    // ...
}
The code in the foreach block is NOT executed. The returned list is empty.
However, the blob reports a non-zero length when I check _cloudBlob.Properties.Length.
I can also download the blob and see that it is not empty.
Am I missing something? How can the block list be empty when the blob is not?!
It does not matter whether I use BlockListingFilter.Committed, BlockListingFilter.Uncommitted or BlockListingFilter.All; the list is still empty!
UPDATE
I have copied this blob to a public container so that this issue can be reproduced by anyone.
Here's how to reproduce what I'm unable to understand:
First get blob properties from Azure using the REST API:
HEAD http://dfdev.blob.core.windows.net/pub/test HTTP/1.1
Host: dfdev.blob.core.windows.net
Response:
HTTP/1.1 200 OK
Content-Length: 66
Content-Type: application/octet-stream
Last-Modified: Sat, 02 Feb 2013 09:37:19 GMT
ETag: 0x8CFCF40075A5F31
Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: 4b149a7e-2fcd-4ab4-8d53-12ef047cbfa1
x-ms-version: 2009-09-19
x-ms-lease-status: unlocked
x-ms-blob-type: BlockBlob
Date: Sat, 02 Feb 2013 09:40:54 GMT
The response headers tell us that this is a block blob and that it has a length of 66 bytes.
Now retrieve the block list from:
http://dfdev.blob.core.windows.net/pub/test?comp=blocklist
Response body:
<?xml version="1.0" encoding="utf-8"?><BlockList><CommittedBlocks /></BlockList>
So the blob does not have any committed blocks, yet it has a length of 66 bytes!
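To make the mismatch concrete, here is a minimal sketch that sums the committed block sizes in a Get Block List response body so the total can be compared with the Content-Length from the HEAD response. It uses only System.Xml.Linq, not the storage client library; the class and method names (BlockListCheck, CommittedBlockBytes) are hypothetical and chosen just for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of the Azure storage client library.
public static class BlockListCheck
{
    // Sums the <Size> of every <Block> under <CommittedBlocks>
    // in a Get Block List response body. Returns 0 when there are no blocks.
    public static long CommittedBlockBytes(string blockListXml)
    {
        XDocument doc = XDocument.Parse(blockListXml);
        return doc.Descendants("CommittedBlocks")
                  .Elements("Block")
                  .Sum(b => (long)b.Element("Size"));
    }

    public static void Main()
    {
        // Response body shown above; the HEAD response reported Content-Length: 66.
        string body = "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
                      "<BlockList><CommittedBlocks /></BlockList>";
        long contentLength = 66;

        long fromBlocks = CommittedBlockBytes(body);

        // Prints: committed block bytes: 0, blob length: 66
        Console.WriteLine("committed block bytes: {0}, blob length: {1}",
                          fromBlocks, contentLength);
    }
}
```

For a blob uploaded with Put Block and Put Block List, the same sum would be expected to match the blob length.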
Is this a bug or have I misunderstood something?
Please help me out!
UPDATE 2
I've found that if I upload the blob like this:
container.GetBlockBlobReference("put-only")
    .UploadFromStream(File.OpenRead("test-blob"));
...then a single PUT request is sent to Azure and the blob gets an empty block list (just like above).
However, if I upload the blob like this:
var blob = container.GetBlockBlobReference("put-block");
string blockId = Convert.ToBase64String(Guid.NewGuid().ToByteArray());
blob.PutBlock(blockId, File.OpenRead("test-blob"), null);
blob.PutBlockList(new string[] { blockId });
...then two requests are sent to Azure (one for putting the block and another for putting the block list).
The second blob gets a non-empty block list.
Why won't a single PUT yield a block list?
Can't we rely on the concatenation of a blob's committed blocks being equal to the blob's actual content?!
If not, how do we determine when the block list is OK and when it's not??
UPDATE 3
I've implemented a workaround that I think suffices for the case where we encountered this problem. If we discover an empty block list AND a blob length greater than zero, we assume that everything is OK (although it really isn't) and rewrite the data using Put Block and Put Block List at the next opportunity.
However, although this will do the trick in our case, it is still very confusing that a non-empty block blob can have an empty list of committed blocks!!
Is this by design in Azure? Can anyone explain what's going on?
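The check behind the workaround in UPDATE 3 can be sketched as below. This is only an outline of the decision, not our production code: the name HasContentOutsideBlockList is hypothetical, and the two inputs are assumed to come from the blob's reported length and the number of entries in its committed block list.

```csharp
using System;

// Hypothetical helper for illustration; the inputs are assumed to be the blob's
// reported length and the number of entries in its committed block list.
public static class BlockListWorkaround
{
    // True when the blob has content that no committed block accounts for,
    // which is the state left behind by a single PUT upload.
    public static bool HasContentOutsideBlockList(long blobLength, int committedBlockCount)
    {
        return blobLength > 0 && committedBlockCount == 0;
    }

    public static void Main()
    {
        // The blob from this question: 66 bytes, empty block list.
        Console.WriteLine(HasContentOutsideBlockList(66, 0)); // True

        // A genuinely empty blob.
        Console.WriteLine(HasContentOutsideBlockList(0, 0));  // False

        // A blob uploaded as one block and then committed.
        Console.WriteLine(HasContentOutsideBlockList(66, 1)); // False
    }
}
```

When the check returns true, we treat the blob as valid and schedule it to be rewritten with Put Block and Put Block List.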
UPDATE 4
Microsoft confirmed this issue on the MSDN forums too. Quote from Allen Chen:
I've confirmed with the product team. This is a normal behavior. The x-ms-blob-content-length header is the size of the committed blob. In your case you use Put Blob API so all content is uploaded in a single API and is committed in the same request. As a result in the Get Block List API's response you see the x-ms-blob-content-length header has value of 66 which means the committed blob size.
We have been aware of the issue that the MSDN document of the Get Block List API is not quite clear on this and will work on it.