How to limit *length* of PrettyPrinter


Redirected from Digest (see below)


On 23/07/2020 11:59, Stavros Macrakis wrote:
 > Mousedancer, thanks!

Yes, I even look like a (younger) Kevin Costner!
(you believe me - right!?)


 > As a finger exercise, I thought I'd try implementing print-level and 
print-length as an object-to-object transformer (rather than a pretty 
printer). I know that has a bunch of limitations, but I thought I might 
learn something by trying.
 >
 > Here's a simple function that will copy nested lists while limiting 
their depth and length. When it encounters a non-iterable object, it 
treats it as atomic:
 >
 >     scalartypes = list(map(type,(1,1.0,1j,True,'x',b'x',None)))
 >
 >     def limit(obj,length=-2,depth=-2):
 >          if type(obj) in scalartypes:
 >              return obj
 >          if depth==0:
 >              return 'XXX'
 >          lencnt = length
 >          try:
 >              new = type(obj).__new__(type(obj)) # empty object of same type
 >              for i in obj:
 >                  lencnt = lencnt - 1
 >                  if lencnt == -1:
 >                      new.append('...')          # too long
 >                      break
 >                  else:
 >                      new.append(limit(i,length,depth-1))
 >              return new
 >          except:                                # which exceptions?
 >              return obj                         # not iterable/appendable
 >
 >     limit( [1,2,[31,[321,[3221, 3222],323,324],33],4,5,6], 3,3)
 >
 >             => [1, 2, [31, [321, 'XXX', 323, '...'], 33], '...']
 >
 >
 >
 > This works fine for lists, but not for tuples (because they're 
immutable, so no *append*) or dictionaries (must use *for/in* 
*obj.items()*, and there's no *append*). There must be some way to handle 
this generically so I don't have to special-case them, but I'm stuck. 
Should I accumulate results in a list and then make the list into a 
tuple or dictionary or whatever at the end? But how do I do that?
 >
 > It's not clear how I could handle /arbitrary/ objects... but let's 
start with the standard ones.

This looks like fun!
BTW why are we doing it: is it some sort of 'homework assignment' or are 
you a dev 'scratching an itch'?


May I suggest a review of the first few pages/chapters in the PSL docs 
(Python Standard Library): Built-in Functions, -Constants, -Types, and 
-Exceptions. Also, try typing into the REPL:

     __builtins__.__dict__.keys()

(you will recognise the dict keys from the docs). These may give you a 
more authoritative basis for "scalartypes", etc.

If you're not already familiar with isinstance() and type() then these 
are (also) most definitely useful tools, and thus worth a read...
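
To illustrate the isinstance()/type() distinction, here is a minimal sketch. The names SCALAR_TYPES, is_scalar and Celsius are made up for the example, and which types count as 'scalar' is an assumption you may well want to change:

```python
import numbers

# Assumption: these are the types we want to treat as atomic.
# bool, int, float and complex all count as numbers.Number.
SCALAR_TYPES = (numbers.Number, str, bytes, type(None))

def is_scalar(obj):
    """True if obj is an instance of (a subclass of) a scalar type."""
    return isinstance(obj, SCALAR_TYPES)

class Celsius(float):
    """A float subclass, to show where an exact type() match falls short."""

reading = Celsius(21.5)
print(type(reading) in (int, float))  # False: type() wants an exact match
print(is_scalar(reading))             # True: isinstance() accepts subclasses
print(is_scalar([1, 2]))              # False: a list is a container
```

The point is only that isinstance() respects inheritance, whereas comparing type(obj) against a list of types does not.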


With bottom-up prototyping it is wise to start with the 'standard' 
cases! (and to 'extend' one-bite at a time)
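
On the question of accumulating results in a list and converting at the end: one possible way (a sketch only, not the only answer) is to collect into a plain list and then rebuild the container by calling its type. This reworks the quoted limit() under the assumption that only list, tuple and dict need handling; SCALARS and the ('...', '...') marker pair for truncated dicts are my own inventions:

```python
SCALARS = (int, float, complex, bool, str, bytes, type(None))

def limit(obj, length=-1, depth=-1):
    """Copy obj, truncating containers; negative limits mean 'no limit'."""
    if isinstance(obj, SCALARS):
        return obj
    if depth == 0:
        return 'XXX'                              # too deep
    if isinstance(obj, dict):
        pairs = []
        for count, (key, value) in enumerate(obj.items()):
            if count == length:
                pairs.append(('...', '...'))      # too long (hypothetical marker)
                break
            pairs.append((key, limit(value, length, depth - 1)))
        return dict(pairs)
    if isinstance(obj, (list, tuple)):
        items = []
        for count, item in enumerate(obj):
            if count == length:
                items.append('...')               # too long
                break
            items.append(limit(item, length, depth - 1))
        return type(obj)(items)                   # rebuild as list or tuple
    return obj                                    # anything else: leave alone

print(limit([1, 2, [31, [321, [3221, 3222], 323, 324], 33], 4, 5, 6], 3, 3))
print(limit((1, 2, 3, 4), 2))
```

Beware that type(obj)(items) assumes the constructor accepts a single iterable, which holds for list and tuple but not for (say) a namedtuple.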


Rather than handling objects (today's expansion on the previous), might 
I refer you back to the objective, which (I assume) requires the output 
of a 'screen-ready' string. Accordingly, as the data-structure/network 
is parsed/walked, each recognised-component could be recorded as a 
string, rather than kept/maintained/reproduced in its native form.

Thus:
- find a scalar, stringify it
- find a list, the string is "["
- find a dict, the string is "{"
- find a tuple, the string is "("
etc

The result then, is a series of strings.

a) These could be accumulated, ready for output as a single string. This 
would make it easy to have a further control which limits the number of 
output characters.

b) If the accumulator is a list, then

     accumulator.append( stringified_element )

works happily. Plus, the return statement can use a str.join() to 
turn the accumulator-list into a single string.
(trouble is, if the values should be comma-separated, you don't want to 
separate a bracket (eg as a list's open/close) from the list-contents 
with a comma!) So, maybe that should be done at each layer of nesting?
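
The 'join at each layer of nesting' idea might be sketched like this. It is an illustration under my own assumptions, not a finished design: limited_repr and BRACKETS are hypothetical names, and only list, tuple, set and dict are recognised as containers:

```python
BRACKETS = {list: "[]", tuple: "()", set: "{}", dict: "{}"}

def limited_repr(obj, length=3, depth=3):
    """Stringify obj, showing at most `length` elements per container
    and at most `depth` levels of nesting."""
    kind = type(obj)
    if kind not in BRACKETS:
        return repr(obj)                  # scalar (or unknown): stringify it
    opener, closer = BRACKETS[kind]
    if depth == 0:
        return opener + "..." + closer    # too deep
    parts = []                            # the accumulator for THIS layer
    source = obj.items() if kind is dict else obj
    for count, item in enumerate(source):
        if count == length:
            parts.append("...")           # too long
            break
        if kind is dict:
            key, value = item
            parts.append(limited_repr(key, length, depth - 1) + ": "
                         + limited_repr(value, length, depth - 1))
        else:
            parts.append(limited_repr(item, length, depth - 1))
    # Commas go between elements only, never next to the brackets.
    return opener + ", ".join(parts) + closer

print(limited_repr([1, 2, [31, [321, [3221, 3222], 323, 324], 33], 4, 5, 6]))
# [1, 2, [31, [321, [...], 323, ...], 33], ...]
```

Known rough edges: an empty set comes out as "{}", a one-element tuple lacks its trailing comma, and set ordering is whatever iteration happens to give.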

Can you spell FSM?
(Finite State Machine)


Next set of thoughts: I'm wondering if you mightn't glean a few ideas 
from reviewing the pprint source-code?
(on my (Fedora-Linux) machine it is stored as 
/usr/lib64/python3.7/pprint.py)

Indeed, with imperial ambitions of 'embrace and extend', might you be 
able to sub-class pprint's PrettyPrinter class and bend it to your will?
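
Before sub-classing anything, it may be worth noting what pprint offers out of the box: the documented depth parameter already covers the 'print-level' half of the problem, so only 'print-length' would need new code. A small demonstration, re-using the data from the quoted example:

```python
import pprint

source_data = [1, 2, [31, [321, [3221, 3222], 323, 324], 33], 4, 5, 6]

# depth=2 shows two levels of nesting; anything deeper becomes '...'
text = pprint.pformat(source_data, depth=2)
print(text)
# [1, 2, [31, [...], 33], 4, 5, 6]
```

There is no equivalent built-in control for the number of elements, as far as I can see, which is where a sub-class or post-processor would come in.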


Lastly, (and contrasting with your next comment) I became a little 
intrigued, so yesterday, whilst waiting for an on-line meeting's (rather 
rude, IMHO) aside to finish (and thus move-on to topics which involved 
me!), I had a little 'play' with the idea of a post-processor (per 
previous msg).

What I built gives the impression that "quick and dirty" is a 
thoroughly-considered and well-designed methodology, but the prototype 
successfully shortens pprint-output to a requisite number of elements. Thus:

     source_data = [1,2,[31,[321,[3221, 3222],323,324],33],4,5,6]
     limit( source_data, 3 )

where the second argument (3) is the element-count/-limit; results in:

     [1,2,[31

ie the first three elements extracted from nested lists (tuples, sets, 
scalars, etc).
(recall my earlier query about what constitutes an "element"?)


 > Sorry for the very basic questions!

No such thing - what is "basic" to you, might seem 'advanced' to someone 
else, and v-v. Plus, you never know how many 'lurkers' (see below) might 
be quietly-benefiting from their observation of any discussion!


PS on which subject, List Etiquette:

There are many people who 'lurk' on the list - which is fine. Presumably 
they are able to read contributions and learn from what seems 
interesting. This behavior is (to me) a major justification for the 
digest service - not being 'bombarded' by many email msgs is how some 
voice their concerns/preference.

However, once one asks a question, one's involvement is no longer 
passive ('lurking'). Hence:

 >     When replying, please edit your Subject line so it is more specific
 >     than "Re: Contents of Python-list digest..."
...

 >        16. Re: How to limit *length* of PrettyPrinter (dn)
...

Further, many of us manage our email 'bombardment' through 
'organisation' rather than 'limitation' (or 'condensation'?); and thus 
"threading" is important - most competent mail-clients offer this, as do 
GMail and many web-mail services. From a list perspective, this collects 
and maintains all parts of a conversation - your contributions and mine, 
in the 'same place'. Sadly, switching between the list-digest and 
single-messages breaks threading! Also, no-one (including the archiving 
software) looking at the archive (or the digest) would be able to detect 
any link between an earlier conversation called "How to limit *length* 
of PrettyPrinter" and one entitled "...Digest..."!
-- 
Regards =dn
