Skip to content

feat(storage): log warning when GCS serves more bytes than requested#16141

Open
v-pratap wants to merge 17 commits into
googleapis:mainfrom
v-pratap:log-additional-bytes
Open

feat(storage): log warning when GCS serves more bytes than requested#16141
v-pratap wants to merge 17 commits into
googleapis:mainfrom
v-pratap:log-additional-bytes

Conversation

@v-pratap
Copy link
Copy Markdown
Contributor

@v-pratap v-pratap commented Jun 8, 2026

Implement a thread-safe, one-shot mechanism to detect when a range request is overshot, and print an actionable warning log at the end of the stream.

This is implemented for all kinds of reads Sync, Async, MRD.
The warning log contains: bucket name, object name, and the amount of extra bytes served.

@v-pratap v-pratap requested review from a team as code owners June 8, 2026 08:18
@product-auto-label product-auto-label Bot added the api: storage Issues related to the Cloud Storage API. label Jun 8, 2026
Implement a thread-safe, one-shot mechanism to detect when a range request is overshot, and print an actionable warning log at the end of the stream.

This is implemented for:
1. ObjectReadStreambuf (Single-Stream Readers)
2. ReadRange (Concurrent / Multi-Range Download Managers)

The warning log contains rich context: bucket name, object name, and the amount of extra bytes served.

Bug: b/455959466
@v-pratap v-pratap force-pushed the log-additional-bytes branch from 5cfbb21 to 9f95aff Compare June 8, 2026 08:20
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces overrun logging to detect and warn when more bytes are received from GCS than requested, updating both ReadRange and ObjectReadStreambuf to track received bytes and log warnings upon overrun. The review feedback highlights a critical issue in ObjectReadStreambuf where remain_ is only decremented in xsgetn() but not in underflow(), which will cause inaccurate tracking for character-by-character or line-by-line reads. Additionally, it is recommended to add test coverage for these underflow() scenarios.

Comment on lines +251 to +256
{
std::lock_guard<std::mutex> lk(mu_);
if (remain_.has_value()) {
*remain_ -= read->bytes_received;
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In ObjectReadStreambuf, bytes can be read from source_ either via xsgetn() (for bulk reads) or via underflow() (when filling the internal buffer for single-character or line-based reads).

Currently, remain_ is only decremented inside xsgetn(). If a user reads from the stream using methods that trigger underflow() (such as std::getline, stream >> var, or stream.get()), those received bytes will not be subtracted from remain_. This will result in inaccurate tracking and cause the overrun warning to be missed or calculated incorrectly.

Please ensure that underflow() also decrements remain_ under the lock when it successfully receives bytes from source_.

Comment thread google/cloud/storage/internal/object_read_streambuf_test.cc
v-pratap added 9 commits June 8, 2026 08:31
…narios

Add unit tests to verify overrun logging behavior under:
1. Character-by-character reads (using stream.get()) in ObjectReadStreambuf.
2. ReadLast(N) requests where GCS returns less than N (no logging expected).
3. ReadLast(N) requests where GCS returns more than N (logging expected).

These tests cover both synchronous (ObjectReadStreambuf) and
asynchronous (ReadRange) implementation paths.
- Skip overrun warning in ObjectReadStreambuf if decompressive
  transcoding is active (transformation().has_value() is true),
  matching the behavior of other SDKs (like Go) and avoiding false
  positives.
- Initialize remain_ to 0 for empty range requests (range.end == begin)
  in ObjectReadStreambuf to ensure that any erroneous data sent by
  GCS is caught and logged as an overrun.
- Add corresponding unit tests to verify these behaviors.
- Expose Remain() in the public ObjectReadStream class to allow
  users to programmatically query the remaining bytes in a range
  request, improving consistency with the Go SDK.
- Add thread-safety comments (// Must be called while holding mu_.)
  to ReadRange::CheckOverrun() in read_range.h and read_range.cc
  to document lock requirements.
…verrun logging

- Add MetadataAccessor callback to ReadRange to allow it to
  dynamically query the object's metadata from ObjectDescriptorImpl.
- Update ReadRange::CheckOverrun() to check if the object's
  content_encoding is "gzip". If so, suppress the overrun warning,
  matching the behavior of the synchronous path and other SDKs (like Go)
  to avoid false positives during decompressive transcoding.
- Pass the metadata accessor callback from ObjectDescriptorImpl::Read
  using a weak pointer to avoid circular references.
- Add TranscodingSuppressesWarning unit test to read_range_test.cc
  to verify this behavior.
- Expand the lock scope in xsgetn() to protect all metadata updates
  (generation_, metageneration_, storage_class_, size_,
  transformation_, headers_, and source_pos_) under the mu_ lock,
  preventing data races when these are accessed concurrently.
- Refactor all metadata getters in ObjectReadStreambuf and
  ObjectReadStream to lock mu_ and return by value instead of const&
  to guarantee thread safety during concurrent reads.
- Fix a critical self-deadlock in ObjectReadStreambuf::Close() by
  accessing the transformation_ member variable directly instead of
  calling the transformation() getter (which attempts to re-acquire
  the already-held mu_ lock).
…ructor

- Remove default arguments for bucket_name and object_name in the
  primary ReadRange constructor, making them mandatory.
- Delete the fallback constructor (without bucket/object names) to
  prevent any loophole for empty logs.
- Update all unit tests in read_range_test.cc to pass explicit
  bucket and object names.
…tream

- Refactor status(), received_hash(), and computed_hash() in
  ObjectReadStreambuf and ObjectReadStream to lock mu_ and return
  by value instead of const&.
- This completes the thread-safety synchronization of the synchronous
  path, preventing data races when these getters are accessed
  concurrently with active stream reads.
- Refactor requested_length_ in ReadRange to be an
  absl::optional<std::int64_t>. This allows us to cleanly distinguish
  between "no limit (read to end)" (absl::nullopt) and a strict limit
  (e.g. 0 for a zero-length request).
- Update CheckOverrun() to check *requested_length_ >= 0, allowing
  overrun detection on zero-length requests, bringing parity with the
  Go SDK and the C++ synchronous path.
- Update ObjectDescriptorImpl::Read to pass
  p.length > 0 ? absl::make_optional(p.length) : absl::nullopt to the
  ReadRange constructor.
- Add a new ZeroLengthOverrunLogging test case to read_range_test.cc
  to verify that an explicit 0-byte limit overrun is correctly caught
  and logged.
…ngth overrun checks

- Wrap source_pos_ access in ObjectReadStreambuf::seekoff() with
  std::lock_guard<std::mutex> to prevent data races when tellg()
  is called concurrently with active stream reads.
- Add read_to_end boolean flag (defaulting to false) to
  ObjectDescriptorConnection::ReadParams in a backwards-compatible way.
- Update ObjectDescriptor::ReadFromOffset and ReadLast to explicitly
  set read_to_end = true, while Read sets it to false.
- Update ObjectDescriptorImpl::Read to use p.read_to_end to cleanly
  distinguish "read to end" (passing absl::nullopt) from a strict
  limit (passing p.length, which can be 0 for a zero-length request).
  This allows ReadRange to detect overruns on 0-byte async requests
  without false positives on full downloads.
@v-pratap v-pratap added the do not review Indicates a PR is not ready for review label Jun 8, 2026
v-pratap added 4 commits June 8, 2026 09:27
…rrun checks

- Remove the MetadataAccessor callback from ReadRange constructor and
  replace it with a simple is_transcoded boolean parameter in OnRead().
  This completely avoids the ABBA deadlock and self-deadlock risk during
  OnFinish where we would otherwise try to acquire
  ObjectDescriptorImpl::mu_ while holding ReadRange::mu_.
- Update ObjectDescriptorImpl::OnRead to evaluate the transcoding status
  and pass it to ReadRange::OnRead().
- Update all unit tests in read_range_test.cc to pass the transcoding
  status directly to OnRead, dramatically simplifying the tests by
  removing the verbose metadata mocks.
- Update ObjectReadStreambuf constructor to use >= 0 for ReadLast
  option handling, enabling overrun checks on explicit ReadLast(0)
  requests.
Latch the is_transcoded_ value to true if it is ever true during the
stream's lifetime in ReadRange::OnRead(). This is marginally safer
and prevents any edge cases where subsequent chunks might report
different metadata.
… races in sync

- Move logged_warning_ = true; inside the overrun check if block in
  ReadRange::CheckOverrun(). This ensures that we don't prematurely
  suppress overrun checks on subsequent chunks in a multi-chunk stream
  if the first chunk was within limits.
- Add MultiChunkOverrunLogging test case to read_range_test.cc to
  verify that overrun is correctly caught on a second chunk.
- Add std::lock_guard to ObjectReadStreambuf::ReportError(),
  CheckPreconditions(), and ValidateHashes() to prevent concurrent
  data races on status_, hash_validator_result_, and metadata fields.
- Make ObjectReadStreambuf::IsOpen() thread-safe and nullptr-safe
  by locking mu_ and checking source_ != nullptr.
- Update CheckPreconditions() and ValidateHashes() to access
  source_ and transformation_ directly under the lock, and update
  ThrowHashMismatchDelegate() to use member variables directly,
  preventing lock re-entrancy deadlocks on non-recursive std::mutex.
- Update ObjectDescriptorImpl::Read to calculate the correct limit for
  ReadLast requests (where p.start < 0). Since ReadLast(limit)
  requests the last limit bytes, the limit is -p.start. This enables
  overrun detection on ReadLast requests without changing the gRPC
  protocol behavior (which still receives read_length = 0).
- Add ReadLastOverrunLogging test case to read_range_test.cc to
  verify that overrun is correctly caught and logged for ReadLast
  requests.
Copy link
Copy Markdown
Contributor

@kalragauri kalragauri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need similar changes under AsyncClient::ReadObject?

hash_validator_(CreateHashValidator(request)),
bucket_name_(request.bucket_name()),
object_name_(request.object_name()) {
if (request.HasOption<ReadRange>() && request.HasOption<ReadFromOffset>()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user performs a full object read (no range options) and GCS overshoots, it will not be logged because remain_ will be nullopt.


offset_ += content.size();
received_bytes_ += content.size();
if (length_ != 0) length_ -= std::min<std::int64_t>(content.size(), length_);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't modify this line but if GCS overshoots, length_ will be set to 0 here. If the connection breaks, ReadRange::RangeForResume will be called to resume the connection, and read_length will be set to 0 in the proto: range.set_read_length(length_);

IIUC, read_length = 0 is interpreted as full object read, which will cause the resumed stream to download the entire remainder of the object.

v-pratap added 3 commits June 8, 2026 10:16
…ync path

- Update ReadRange::OnRead to ignore checksum mismatches if the object
  is transcoded (is_transcoded_ is true). When GCS dynamically
  decompresses (transcodes) a gzip-compressed object, the received bytes
  are uncompressed, but GCS metadata hashes belong to the original
  compressed file, which would otherwise always trigger a false-positive
  DataLossError.
- Add TranscodingIgnoresChecksumMismatch test case to
  read_range_test.cc to verify that a checksum mismatch is ignored
  and the read succeeds when is_transcoded is true.
…overrun checks

- Update ReadRange::RangeForResume to return absl::nullopt (refuse to
  resume) if the requested range has already been fully satisfied or
  exceeded. This prevents a catastrophic bug where a completed range
  read would be resumed with read_length = 0 (unlimited), downloading
  the rest of the file and causing massive unexpected egress costs.
- Add NoResumeIfRequestSatisfied and NoResumeIfRequestExceeded tests
  to read_range_test.cc to verify this behavior.
- Move CheckOverrun() in ReadRange::OnRead to be called immediately
  on every chunk rather than waiting for range_end(), enabling immediate
  warnings if a rogue backend sends infinite data.
- Update ObjectReadStreambuf to initialize remain_ to *size_ as
  soon as the object's total size is learned in xsgetn, enabling
  overrun detection for full-object downloads.
- Move the remain_ decrement to the end of the lock block in
  ObjectReadStreambuf::xsgetn to ensure the first chunk's bytes are
  correctly subtracted from the newly-learned size.
- Add FullObjectReadOverrunLogging test to
  object_read_streambuf_test.cc to verify full object overrun logging.
- Remove the redundant std::lock_guard from
  ObjectReadStreambuf::seekoff() since streams are inherently
  not thread-safe for concurrent read/seek operations.
Add an explanatory comment in ObjectReadStreambuf::xsgetn to document
the dynamic late-initialization strategy for remain_ during full-object
reads. Since the expected size is not known upfront for full reads, we
initialize remain_ to the object's size the first time we learn it
from the GCS response metadata.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: storage Issues related to the Cloud Storage API. do not review Indicates a PR is not ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants