Comments for weaselhat http://www.weaselhat.com Sat, 08 Jan 2022 22:00:40 +0000 hourly 1 https://wordpress.org/?v=6.3.2 Comment on Processing Semi-Structured Data in the Unix Shell by Michael Greenberg http://www.weaselhat.com/2021/06/29/processing-semi-structured-data-in-the-unix-shell/comment-page-1/#comment-187705 Sun, 04 Jul 2021 23:28:33 +0000 http://www.weaselhat.com/?p=936#comment-187705 Thanks! I had never heard of 9p, and it seems cool! I had some trouble finding details, but my impression is that FUSE is faster than 9p, and FUSE itself is already substantially slower than in-kernel filesystems. Bento seems like a cool option (and it’s Rust based!). Please feel free to chime in on my tracking issue <https://github.com/mgree/ffs/issues/9> if you have thoughts on performance!

I’m hopeful that fuser will just get Windows support (https://github.com/cberner/fuser/issues/129) and I can use that.

]]>
Comment on Processing Semi-Structured Data in the Unix Shell by Jay McCarthy http://www.weaselhat.com/2021/06/29/processing-semi-structured-data-in-the-unix-shell/comment-page-1/#comment-187595 Sun, 04 Jul 2021 14:37:17 +0000 http://www.weaselhat.com/?p=936#comment-187595 You might consider using 9p instead of FUSE, or in addition to it. This is very cool.

]]>
Comment on Cast notation: a case study by James McKinna http://www.weaselhat.com/2020/09/30/cast-notation-a-case-study/comment-page-1/#comment-116694 Wed, 14 Oct 2020 11:58:39 +0000 http://www.weaselhat.com/?p=900#comment-116694 Michael,

thanks again for the provocation in your SIGPLAN blog piece, and the follow-up here focusing in on cast notation.

In his 2015 SNAPL paper “A Complement to blame”, Phil Wadler attributes the introduction of the ‘e : S => T’ notation (NB single colon!) to the 2011 POPL paper “Blame for All” (with Amal Ahmed, Robby Findler and Jeremy Siek), in the course of an interesting discussion of (the evolution of) notation for contracts, casts and blame over a 15–20 year period.

The discussion makes clear that it can indeed take a long time for a judicious choice of symbolism to emerge, measured in terms of cognitive load as you have it in your SIGPLAN article. In Phil’s case, he appeals to two factors: left-to-right reading order (for Western authors! a reminder to check our privilege, if ever there was one), together with the ‘free ride’ of being able to concatenate two casts, but *not* of eliding the middle term (as Jeremy or others might unwittingly surmise as a further potential ‘free ride’).

Interestingly, Phil makes a separate point there, namely that using a C-like cast notation (he mentions Siek/Taha postfix ‘@’, but… mutatis mutandis) appears to require cast reduction to depend on type inference, in order to be able to push through casts at higher type in the ‘usual’ (Wrap) rule. So his justification for source/target casting (and the subsequent notational choice) has a semantic component to it, as well as the visual.

But it seems that if you take our 2019 WGT paper which uses an ‘unwrapped’ rule for higher-type casts, then such a rule would work perfectly fine, if more opaquely, written ‘@’-style (I’d write it, if I could figure out how to do so cleanly in markdown [fixed –MMG]). A pity we didn’t notice that at the time, though!

2019 WGT:

lam (x:A). M[x] : (x:A) -> B[x] => (y:C) -> D[y] -->
lam (y:C). let x = (y:C => A) in (M[x]:B[x] => D[y])

versus

lam (x:A). M[x] @ (y:C) -> D[y] -->
lam (y:C). let x = (y @ A) in (M[x] @ D[y])

And then the choice between notation ‘M @ A’ versus your preference for Carnapian functor/argument style ‘cast(M, A)’ (say), starts to seem a more modest cognitive gap to have to bridge… and with no ‘accidental’ free rides, either: ‘M @ A @ B’ can’t (seemingly) elide ‘A’ without doing some (object-level) reduction. But even then, how much of this is itself a ‘happy’ accident of our particular (syntactic and semantic) choices?

]]>
Comment on Formulog: ML + Datalog + SMT by Michael Greenberg http://www.weaselhat.com/2020/08/07/formulog-ml-datalog-smt/comment-page-1/#comment-105478 Mon, 10 Aug 2020 16:15:28 +0000 http://www.weaselhat.com/?p=835#comment-105478 In reply to yaseen.

That’s a very good idea! I have an undergraduate student who will be helping me convert Smaragdakis and Balatsouras’s “Pointer Analysis” tutorial, but that’s for a modestly complex language. I’ll see what we can do.

]]>
Comment on Formulog: ML + Datalog + SMT by yaseen http://www.weaselhat.com/2020/08/07/formulog-ml-datalog-smt/comment-page-1/#comment-105334 Sun, 09 Aug 2020 00:05:43 +0000 http://www.weaselhat.com/?p=835#comment-105334 can you guys write a tutorial that targets a really basic language please

]]>
Comment on Formulog: ML + Datalog + SMT by brian herman http://www.weaselhat.com/2020/08/07/formulog-ml-datalog-smt/comment-page-1/#comment-105327 Sat, 08 Aug 2020 16:05:42 +0000 http://www.weaselhat.com/?p=835#comment-105327 cool

]]>
Comment on Smart constructors are smarter than you think by Michael Greenberg http://www.weaselhat.com/2020/05/07/smart-constructors-are-smarter-than-you-think/comment-page-1/#comment-98171 Sat, 27 Jun 2020 22:35:57 +0000 http://www.weaselhat.com/?p=785#comment-98171 In reply to Neel Krishnaswami.

Yes, indeed: I figured this out for myself in the followup post!

Thanks for the link to the Antimirov stuff! One reason I like the Brzozowski implementation more than the Antimirov one is that you can build an implicit-state DFA matcher with just a left fold. Stephanie Weirich did this in a fancily typed way at her POPL 2017 keynote.
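For concreteness, here’s a minimal sketch of that left-fold matcher — a toy untyped Python encoding (nothing like Stephanie’s fancily typed version), with regexes as tuples:

```python
# Matching via Brzozowski derivatives as a single left fold: the
# "implicit-state DFA" is just the current regex. Toy tuple encoding:
# ('empty',), ('eps',), ('chr', c), ('alt', r, s), ('seq', r, s), ('star', r).

from functools import reduce

def nullable(r):
    tag = r[0]
    if tag in ('eps', 'star'): return True
    if tag in ('empty', 'chr'): return False
    if tag == 'alt': return nullable(r[1]) or nullable(r[2])
    if tag == 'seq': return nullable(r[1]) and nullable(r[2])

def deriv(r, c):
    tag = r[0]
    if tag in ('eps', 'empty'): return ('empty',)
    if tag == 'chr': return ('eps',) if r[1] == c else ('empty',)
    if tag == 'alt': return ('alt', deriv(r[1], c), deriv(r[2], c))
    if tag == 'star': return ('seq', deriv(r[1], c), r)
    if tag == 'seq':
        left = ('seq', deriv(r[1], c), r[2])
        return ('alt', left, deriv(r[2], c)) if nullable(r[1]) else left

def matches(r, w):
    # fold the derivative over the word, then check nullability at the end
    return nullable(reduce(deriv, w, r))
```

(No smart constructors here, so intermediate regexes can blow up — see the discussion below; this is just the shape of the fold.)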

]]>
Comment on Smart constructors are smarter than you think by Neel Krishnaswami http://www.weaselhat.com/2020/05/07/smart-constructors-are-smarter-than-you-think/comment-page-1/#comment-97839 Fri, 26 Jun 2020 10:50:50 +0000 http://www.weaselhat.com/?p=785#comment-97839 Hi Michael,

If you don’t do any normalization, then Brzozowski derivatives can grow exponentially. Consider the regular expression a*.a**, and then take its derivative with respect to a:

d_a(a*.a**) = d_a(a*).a** + d_a(a**)
= a*.a** + d_a(a*).a**
= a*.a** + a*.a**

This regular expression has doubled in size, and each derivative with respect to a will double it again.

However, Brzozowski proved in his original paper that if you quotient derivatives by “similarity” (basically ACI with respect to +, but only at the outermost level), then there are only a finite number of word derivatives. So a smart constructor implementation smart enough to ensure that (alt r r == r), (alt r1 r2 == alt r2 r1) and (alt (alt r1 r2) r3 == alt r1 (alt r2 r3)) suffices to ensure you only ever build a finite number of states!
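For instance, a hypothetical alt smart constructor (toy encoding assumed: regexes as hashable tuples like ('chr', 'a') or binary ('alt', r, s)) can build similarity in by flattening, de-duplicating, and canonically ordering the alternants:

```python
# A hypothetical smart constructor enforcing Brzozowski similarity at the
# top level: + is flattened, de-duplicated (alt r r == r), and rebuilt as
# a canonical right-nested chain (so commutativity and associativity hold
# on the nose for the constructed terms).

def alternants(r):
    # collect the leaves of a (possibly nested) alternation
    if isinstance(r, tuple) and r[0] == 'alt':
        return alternants(r[1]) + alternants(r[2])
    return [r]

def alt(r1, r2):
    # sort by repr purely to fix *some* deterministic canonical order
    leaves = sorted(set(alternants(r1) + alternants(r2)), key=repr)
    out = leaves[-1]
    for leaf in reversed(leaves[:-1]):
        out = ('alt', leaf, out)  # canonical right-nested chain
    return out
```

With this, syntactically distinct but similar alternations collapse to identical terms, so the set of reachable states stays finite.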

I found the proof in his original paper very obscure, though, and didn’t understand it until I read Antimirov’s paper on partial derivatives. That yields an even simpler implementation technique than Brzozowski’s original — you can use it to implement a complete table-driven DFA-based matcher in ~50 lines of code. See

https://semantic-domain.blogspot.com/2013/11/antimirov-derivatives-for-regular.html
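Along those lines, here is a rough sketch (a toy Python encoding, not the code from that post) of Antimirov-style partial derivatives, where a matcher state is a finite *set* of regexes:

```python
# Antimirov partial derivatives: pderiv maps a regex and a character to a
# set of regexes, and only finitely many sets are reachable, so collecting
# them gives DFA states. Toy tuple encoding: ('empty',), ('eps',),
# ('chr', c), ('alt', r, s), ('seq', r, s), ('star', r).

def nullable(r):
    tag = r[0]
    if tag in ('eps', 'star'): return True
    if tag in ('empty', 'chr'): return False
    if tag == 'alt': return nullable(r[1]) or nullable(r[2])
    if tag == 'seq': return nullable(r[1]) and nullable(r[2])

def seq(r, s):
    # light normalization: eps . s == s
    return s if r == ('eps',) else ('seq', r, s)

def pderiv(r, c):
    tag = r[0]
    if tag in ('eps', 'empty'): return set()
    if tag == 'chr': return {('eps',)} if r[1] == c else set()
    if tag == 'alt': return pderiv(r[1], c) | pderiv(r[2], c)
    if tag == 'star': return {seq(p, r) for p in pderiv(r[1], c)}
    if tag == 'seq':
        ds = {seq(p, r[2]) for p in pderiv(r[1], c)}
        return ds | pderiv(r[2], c) if nullable(r[1]) else ds

def matches(r, w):
    states = {r}  # a state is a finite set of regexes
    for c in w:
        states = set().union(*(pderiv(s, c) for s in states))
    return any(nullable(s) for s in states)
```

Memoizing the set-to-set transitions (not shown) is what turns this into the table-driven DFA matcher the post describes.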

]]>
Comment on The Big Brzozowski by gasche http://www.weaselhat.com/2020/05/08/the-big-brzozowski/comment-page-1/#comment-92026 Thu, 28 May 2020 16:07:27 +0000 http://www.weaselhat.com/?p=819#comment-92026 In reply to Michael Greenberg.

(You are of course right, the counting argument is flawed.)

]]>
Comment on The Big Brzozowski by Michael Greenberg http://www.weaselhat.com/2020/05/08/the-big-brzozowski/comment-page-1/#comment-87845 Sat, 09 May 2020 14:56:58 +0000 http://www.weaselhat.com/?p=819#comment-87845 In reply to gasche.

Yes, minimization is an interesting place to go! I agree that there should be no free lunch here—we’re simply in exponential territory. Owens, Reppy, and Turon’s paper explores the space of smart constructors a bit—I’m curious about how the different choices here affect proxy measures (regex size and number of states) and direct measures (performance of matching and equivalence checking and explicit enumeration).

The worst blowup doesn’t come from e1|e2—it comes from nullable sequences, where d c (seq e1 e2) = alt (seq (d c e1) e2) (d c e2). Notice that we have both derivatives _and_ the original e2.
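To see that blowup concretely, here’s a small self-contained sketch (toy tuple encoding, deliberately no smart constructors) that iterates d_a on a*.a** and watches the term size climb:

```python
# Demonstrating the nullable-sequence blowup: with no normalization,
# iterated derivatives of a*.a** keep growing. Toy tuple encoding:
# ('empty',), ('eps',), ('chr', c), ('alt', r, s), ('seq', r, s), ('star', r).

def nullable(r):
    tag = r[0]
    if tag in ('eps', 'star'): return True
    if tag in ('empty', 'chr'): return False
    if tag == 'alt': return nullable(r[1]) or nullable(r[2])
    if tag == 'seq': return nullable(r[1]) and nullable(r[2])

def deriv(c, r):
    tag = r[0]
    if tag in ('eps', 'empty'): return ('empty',)
    if tag == 'chr': return ('eps',) if r[1] == c else ('empty',)
    if tag == 'alt': return ('alt', deriv(c, r[1]), deriv(c, r[2]))
    if tag == 'star': return ('seq', deriv(c, r[1]), r)
    if tag == 'seq':
        left = ('seq', deriv(c, r[1]), r[2])
        # the rule above: both (d c e2) AND the original e2 survive
        # when e1 is nullable
        return ('alt', left, deriv(c, r[2])) if nullable(r[1]) else left

def size(r):
    return 1 + sum(size(x) for x in r[1:] if isinstance(x, tuple))

# a* . a**
r = ('seq', ('star', ('chr', 'a')), ('star', ('star', ('chr', 'a'))))
sizes = []
for _ in range(5):
    r = deriv('a', r)
    sizes.append(size(r))
```

Each step of the loop copies the original right-hand sides into the new term, so the sizes are strictly increasing.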

I morally agree with you about the post facto inevitability, but I’m not totally following the counting argument. There are exponentially many regular expressions of a given size (e.g., 151,950 syntactically distinct regular expressions of size 6 and 1,468,590 of size 7). It seems to me that, in principle, you could have exponentially many iterated derivatives of a given expression without needing to grow so large. On some level, there shouldn’t be room at sizes n+1 through n+k for exponentially many derivatives from size n and exponentially many new, distinct regular expressions, and their derivatives, and, and… but it’s not obvious to me that it’s impossible to have room.

]]>