
Releases: lmmx/range-streams

Null-terminated file size headers now permitted

21 Apr 21:13

Async support and fetch helper functions

12 Aug 20:51
  • 🚀 Async now supported for streams in single request mode
  • RangeStream and its subclasses now have a make_async_fetcher method, which takes a list of URLs, an async callback function and (optionally) a client to use, and returns an AsyncFetcher. The callback receives three arguments: the AsyncFetcher, the RangeStream (or subclass) instance holding the received response, and the source URL that was requested.

Example showing the speedup this brings to PNGs. PNG is a very linear format (unlike e.g. zip files) made up of many chunks, so reading one takes a large number of requests, which annihilates non-async performance:

from range_streams.codecs.png import PngStream
from some_urls import urls  # placeholder module providing a list of PNG URLs
import httpx
from tqdm import tqdm
from time import time
from random import shuffle

async def callback(fetcher, stream, url):
    await stream.enumerate_chunks_async()

sample_size = 60
dbl_smpl_sz = 2 * sample_size
print(f"Sampling {sample_size} of {len(urls)} PNG URLs each time (200px thumbnails)\n")

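for i in range(4):
    # Take the first 2 * sample_size URLs, shuffle them, then split them into
    # two disjoint halves: one for the sync run and one for the async run
    urls_sample = list(urls[:dbl_smpl_sz])
    shuffle(urls_sample)
    urls_a, urls_b = urls_sample[:sample_size], urls_sample[sample_size:]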

    print("Synchronous")
    t0 = time()
    c = httpx.Client()
    for u in tqdm(urls_a):
        s = PngStream(url=u, client=c, enumerate_chunks=True, scan_ihdr=False)
        del s
    t1 = time()
    print(f"Done in {t1-t0}s")

    print("Asynchronous")
    fetched = PngStream.make_async_fetcher(urls=urls_b, callback=callback)
    fetched.make_calls()
    t2 = time()
    print(f"Done in {t2-t1}s")
    print()

[Image: async-range-streams-speed-test]

Single request mode

04 Aug 16:07
  • 🔍 Windowed ranges
  • 🚫 HEAD requests replaced with open-ended partial content requests
  • ⚡ Single request mode via the single_request flag (defaults to False for RangeStream and all of its subclasses except the PNG codec, PngStream)
  • 🐛 Bug fixes and more tests to stop them coming back!
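
As background to replacing HEAD requests with open-ended partial content requests: in HTTP, a request with the header "Range: bytes=0-" that is answered with 206 Partial Content carries a Content-Range header whose last field is the total size of the resource, so no separate HEAD request is needed to learn the file length. The sketch below only illustrates that header format. It is not the library's own code, and the function name total_length_from_content_range is hypothetical.

```python
import re
from typing import Optional


def total_length_from_content_range(header_value: str) -> Optional[int]:
    """Return the total resource size from a Content-Range header value.

    Hypothetical helper for illustration (not part of range-streams).
    Accepts values such as 'bytes 0-1023/4096'. Returns None when the
    server reports the total size as unknown ('*').
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", header_value.strip())
    if match is None:
        raise ValueError(f"Unrecognised Content-Range value: {header_value!r}")
    total = match.group(3)
    return None if total == "*" else int(total)


# Example header as it might come back for a "Range: bytes=0-" request
print(total_length_from_content_range("bytes 0-4095/4096"))
```

With the total length known from the first response, later ranges can be requested relative to the end of the file (for example, where a ZIP archive keeps its central directory).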

ZIP, tar, conda, and PNG codec support

29 Jul 13:08

PyPI rollout

13 Jul 16:23

🚢 🏷️ v0.1.0: “PyPI rollout”

  • Efficient and lightweight range streaming with file-like objects

  • 3 pruning levels when handling range overlaps

    1. "replant": existing ranges that overlap a newly requested range will be re-requested or have their iterator truncated
    2. "burn": existing ranges that overlap a newly requested range will be deleted
    3. "strict": existing ranges that overlap a newly requested range will raise errors
  • 98% test coverage

  • New design docs

  • Fully type checked

  • 📦 PyPI v0.1.0
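
The three pruning levels listed above can be pictured with plain Python range objects standing in for byte ranges. This is a minimal sketch under stated assumptions, not the library's implementation: the function register_range is hypothetical, and "replant" is modelled here by simply trimming the existing range down to the parts that do not overlap (in the library the affected range is re-requested or has its iterator truncated).

```python
from typing import List


def register_range(existing: List[range], new: range,
                   pruning_level: str = "replant") -> List[range]:
    """Add `new` to `existing`, resolving overlaps by the chosen level.

    Hypothetical illustration only (not the range-streams API).
    """
    if pruning_level not in ("replant", "burn", "strict"):
        raise ValueError(f"Unknown pruning level: {pruning_level!r}")
    kept: List[range] = []
    for r in existing:
        overlaps = r.start < new.stop and new.start < r.stop
        if not overlaps:
            kept.append(r)
        elif pruning_level == "strict":
            # strict: any overlap is an error
            raise ValueError(f"{new} overlaps existing {r}")
        elif pruning_level == "burn":
            # burn: the overlapped range is deleted outright
            continue
        else:
            # replant: keep only the parts of r outside the new range
            if r.start < new.start:
                kept.append(range(r.start, new.start))
            if new.stop < r.stop:
                kept.append(range(new.stop, r.stop))
    kept.append(new)
    return sorted(kept, key=lambda r: r.start)


existing = [range(0, 10), range(20, 30)]
print(register_range(existing, range(5, 25), "replant"))
print(register_range(existing, range(5, 25), "burn"))
```

Under these assumptions, "replant" keeps as much already-fetched data as possible, "burn" trades that for simplicity, and "strict" suits callers that want overlapping requests to be treated as a bug.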