Are there timing diagrams that determine how many nop's are needed? Did you just experiment? Were four nop's used earlier
viewtopic.php?p=1914784#p1914784
to obtain the 0.000300 timing?
If the Pico is overclocked do the number of nop's change?
Are there timing diagrams that determine how many nop's are needed? Did you just experiment? Were four nop's used earlier
Last night I started a job to check up to 10000000000000 (two more zeros) and didn't find any more base 2 and 3 palindromes. The run took 7 hours. At this rate is will take about 7000000 more hours (800 years) to check all 64-bit integers.ejolson wrote: ↑Fri Sep 17, 2021 3:23 amI fixed the overhead related to the string allocation problems in my Go code and then found another number that is simultaneously a palindrome is bases 2 and 3.lurk101 wrote: ↑Fri Sep 17, 2021 3:12 amJust for grins...
On a 600MHz single core Cortex M7
Code: Select all
872187 Run time 0.000012 seconds
My list is now:I tested numbers up to 100000000000. Unfortunately, none these are also palindromes in base 5. Hopefully I didn't miss any. My code is based on the algorithm which appeared in the first post, so clearly some optimization could help.Code: Select all
6643 1100111110011 100010001 1422773 101011011010110110101 2200021200022 5415589 10100101010001010100101 101012010210101 90396755477 1010100001100000100010000011000010101 22122022220102222022122
At any rate, in order to get the program to scale well to multiple cores I learned something more about slices in Go and am happy about that.
I followed the approach suggested earlier and created a palindrome generator that only tests known base-3 palindromes to see if they are also palindromes in base 2. Woohoo! I found another double base-2 base-3 palindrome:ejolson wrote: ↑Fri Sep 17, 2021 8:52 pmLast night I started a job to check up to 10000000000000 (two more zeros) and didn't find any more base 2 and 3 palindromes. The run took 7 hours. At this rate is will take about 7000000 more hours (800 years) to check all 64-bit integers.ejolson wrote: ↑Fri Sep 17, 2021 3:23 amI fixed the overhead related to the string allocation problems in my Go code and then found another number that is simultaneously a palindrome is bases 2 and 3.lurk101 wrote: ↑Fri Sep 17, 2021 3:12 amJust for grins...
On a 600MHz single core Cortex M7
Code: Select all
872187 Run time 0.000012 seconds
My list is now:I tested numbers up to 100000000000. Unfortunately, none these are also palindromes in base 5. Hopefully I didn't miss any. My code is based on the algorithm which appeared in the first post, so clearly some optimization could help.Code: Select all
6643 1100111110011 100010001 1422773 101011011010110110101 2200021200022 5415589 10100101010001010100101 101012010210101 90396755477 1010100001100000100010000011000010101 22122022220102222022122
At any rate, in order to get the program to scale well to multiple cores I learned something more about slices in Go and am happy about that.
Do you think more than 64-bit integers will be needed to find a number that is simultaneously a palindrome is bases 2, 3 and 5? Is there even is such a number?
Since people have already optimized codes to find numbers which are simultaneously palindromes in bases 2 and 10, are any of those numbers also palindromes in base 3?
Code: Select all
381920985378904469
10101001100110110110001110011011001110001101101100110010101
2112200222001222121212221002220022112
My 300 number has pio_sm_put() with pio_sm_get_blocking(). I didn't go the nop route defensively, but now release there isn't really any way this result can be delayed - I am used to DMA-ing stuff to/from the PIO. So non blocking with nops is fineejolson wrote: ↑Fri Sep 17, 2021 8:32 pmAre there timing diagrams that determine how many nop's are needed? Did you just experiment? Were four nop's used earlier
viewtopic.php?p=1914784#p1914784
to obtain the 0.000300 timing?
If the Pico is overclocked do the number of nop's change?
Yes, lucky with the NOPs, but core 0 uses PIO0 and core 1 uses PIO1, why different SMs?kilograham wrote: ↑Sat Sep 18, 2021 12:18 amAlso you are getting lucky with your core interleaving, as you should really use a different SM for both cores.
Code: Select all
static unsigned int palindrome(unsigned int n, PIO pio) {
pio_sm_put(pio, 0, n);
__asm__ __volatile__("nop" ::: "memory");
__asm__ __volatile__("nop" ::: "memory");
return ((pio_sm_get(pio, 0) >> boot_clz32(n)) == n) ? n : 0;
}
Code: Select all
Sum is 872187
Run time is 0.000284 seconds
Code: Select all
Sum is 872187
Run time is 0.000206 seconds
For comparison of languages, here is a Go program that solves the original problem:
Code: Select all
/* euler36.go -- Double-base palindromes
Written 2021 by Eric Olson */
package main
import ( . "fmt"; . "strconv"; . "runtime"; . "os"
"flag"; "unsafe"; "time")
const (
b1 = 2; b2 = 10
)
var (
xmin,xlim uint64
tictime float64
)
func tic() {
now:=time.Now()
tictime=float64(now.Unix())+1.0E-9*float64(now.Nanosecond())
}
func toc() float64 {
now:=time.Now()
return float64(now.Unix())+1.0E-9*float64(now.Nanosecond())-tictime
}
func o1string(b []byte) string { // avoid the heap
return *(*string)(unsafe.Pointer(&b))
}
func dowork(r int,s uint64,c chan uint64) {
x:=make([]byte,256)
mkpale:=func(i uint64,b int) uint64 {
x=x[:0]
x=AppendUint(x,i,b) // avoid the lock
m:=len(x)
x=x[:2*m]
for i:=m;i<len(x);i++ {
x[i]=x[len(x)-i-1]
}
r,e:=ParseUint(o1string(x),b,64)
if e!=nil||r>=xlim { return 0 }
return r
}
mkpalo:=func(i uint64,b int) uint64 {
x=x[:0]
x=AppendUint(x,i,b) // avoid the lock
m:=len(x)
x=x[:2*m-1]
for i:=m;i<len(x);i++ {
x[i]=x[len(x)-i-1]
}
r,e:=ParseUint(o1string(x),b,64)
if e!=nil||r>=xlim { return 0 }
return r
}
ispal:=func(i uint64,b int) bool {
x=x[:0]
x=AppendUint(x,i,b) // avoid the lock
for i:=0; i<len(x)/2; i++ {
if x[i]!=x[len(x)-i-1] {
return false
}
}
return true
}
count:=uint64(0)
for i:=xmin+uint64(r);;i+=s {
j:=mkpalo(i,b2)
if j>0 && ispal(j,b1) { count+=j }
k:=mkpale(i,b2)
if k>0 && ispal(k,b1) { count+=k }
if j==0 && k==0 { break }
}
c<-count
}
func main(){
ncpu:=GOMAXPROCS(0)
Printf("euler36--Double-base palindromes\n"+
"Written 2021 by Eric Olson (GOMAXPROCS=%d)\n\n",ncpu)
flag.Uint64Var(&xmin,"a",1,"First prefix to sum.")
flag.Uint64Var(&xlim,"b",1000000,"Upper limit to sum.")
flag.Parse()
Printf("Summing palindromes from %d up to %d...",xmin,xlim)
ret:=make(chan uint64,ncpu)
tic()
for r:=1; r<ncpu; r++ {
go dowork(r,uint64(ncpu),ret)
}
dowork(0,uint64(ncpu),ret)
count:=uint64(0)
for r:=0; r<ncpu; r++ { count+=<-ret }
tsec:=toc()
Printf("done\n")
Printf("\nSum of all double-base palindromes: %d\n"+
"Calculation time %g seconds.\n",count,tsec)
Exit(0)
}
Code: Select all
$ ./euler36
euler36--Double-base palindromes
Written 2021 by Eric Olson (GOMAXPROCS=4)
Summing palindromes from 1 up to 1000000...done
Sum of all double-base palindromes: 872187
Calculation time 0.0005865097045898438 seconds.
ah, sorry i wasn't paying attention. as long as they aren't using the same PIO/SM then yes fine.lurk101 wrote: ↑Sat Sep 18, 2021 12:36 amYes, lucky with the NOPs, but core 0 uses PIO0 and core 1 uses PIO1, why different SMs?kilograham wrote: ↑Sat Sep 18, 2021 12:18 amAlso you are getting lucky with your core interleaving, as you should really use a different SM for both cores.
Code: Select all
static unsigned int palindrome(uint sm, unsigned int n) {
pio_sm_put(pio0, sm, n);
unsigned int l = faster_clz(n);
return (pio_sm_get(pio0, sm) >> l) == n ? n : 0;
}
Code: Select all
Sum is 872187
Run time is 0.000263 seconds
Woohoo! I'm happy that getting rid of the nop's makes things even faster. Do you need any volatile keywords in that code so the compiler doesn't reorganize it?kilograham wrote: ↑Sat Sep 18, 2021 5:59 pmyou can of course save a few cycles by doing useful work instead of nops.Code: Select all
static unsigned int palindrome(uint sm, unsigned int n) { pio_sm_put(pio0, sm, n); unsigned int l = faster_clz(n); return (pio_sm_get(pio0, sm) >> l) == n ? n : 0; }
Code: Select all
Sum is 872187 Run time is 0.000263 seconds
Also on Pi4, the rust versionejolson wrote: ↑Sat Sep 18, 2021 3:50 amOn a Pi 4B a typical run looks likeCode: Select all
$ ./euler36 euler36--Double-base palindromes Written 2021 by Eric Olson (GOMAXPROCS=4) Summing palindromes from 1 up to 1000000...done Sum of all double-base palindromes: 872187 Calculation time 0.0005865097045898438 seconds.
Code: Select all
#![feature(core_intrinsics)]
use bit_reverse::ParallelReverse;
use std::{intrinsics::ctlz, time::SystemTime};
fn palindrome(x : u32) -> u32 {
if (x.swap_bits() >> ctlz(x)) == x {
return x;
}
0
}
fn search() -> u32 {
let mut sum = 0;
for i in (1..10).step_by(2) {
sum = sum + palindrome(i) + palindrome(i * 11);
for j in 0..10 {
sum = sum + palindrome(i * 101 + j * 10) + palindrome(i * 1001 + j * 110);
for k in 0..10 {
sum = sum + palindrome(i * 10001 + j * 1010 + k * 100) +
palindrome(i * 100001 + j * 10010 + k * 1100);
}
}
}
sum
}
fn main() {
let start = SystemTime::now();
let sum = search();
let finish = SystemTime::now();
println!("{} {:?}", sum, finish.duration_since(start));
}
Code: Select all
872187 Ok(4.999µs)
Faster yet by moving search and palindrome functions to ram. Might be the end of the road speeding this one up?kilograham wrote: ↑Sat Sep 18, 2021 5:59 pmyou can of course save a few cycles by doing useful work instead of nops.Code: Select all
static unsigned int palindrome(uint sm, unsigned int n) { pio_sm_put(pio0, sm, n); unsigned int l = faster_clz(n); return (pio_sm_get(pio0, sm) >> l) == n ? n : 0; }
Code: Select all
Sum is 872187 Run time is 0.000263 seconds
Orkilograham wrote: ↑Sat Sep 18, 2021 5:59 pmyou can of course save a few cycles by doing useful work instead of nops.Code: Select all
static unsigned int palindrome(uint sm, unsigned int n) { pio_sm_put(pio0, sm, n); unsigned int l = faster_clz(n); return (pio_sm_get(pio0, sm) >> l) == n ? n : 0; }
Code: Select all
static unsigned int __not_in_flash_func(palindrome)(unsigned int n, uint32_t sm) {
pio_sm_put(pio0, sm, n);
uint32_t n2 = n << boot_clz32(n); // do the shift while PIO is crunching
return n2 == pio_sm_get(pio0, sm) ? n : 0;
}
On the walk it occurred to me, since it's possible to generate palindromes in order for any base, that one could create a base-two palindrome generator, a base-10 palindrome generator and then check when the two intersect.
Code: Select all
/* palgen.go -- Generator approach to multi-base palindromes
Written 2021 by Eric Olson */
package main
import (
. "fmt"; . "os"; "math/big"; "unsafe"; "time"; "flag"
)
const bsize=512
type palgen func() *big.Int
var (
One=big.NewInt(1)
cmax int
)
var tictime float64
func tic() {
now:=time.Now()
tictime=float64(now.Unix())+1.0E-9*float64(now.Nanosecond())
}
func toc() float64 {
now:=time.Now()
return float64(now.Unix())+1.0E-9*float64(now.Nanosecond())-tictime
}
func o1string(b []byte) string { // avoid the heap
return *(*string)(unsafe.Pointer(&b))
}
func mkepalgen(p *big.Int, b int) palgen {
var z,t big.Int
x:=make([]byte,bsize)
z.Set(p)
f:=func() *big.Int {
x=x[:0]
x=z.Append(x,b)
z.Add(&z,One)
m:=len(x)
x=x[:2*m]
for i:=m;i<len(x);i++ {
x[i]=x[len(x)-i-1]
}
t.SetString(o1string(x),b)
return &t
}
return f
}
func mkopalgen(p *big.Int, b int) palgen {
var z,t big.Int
x:=make([]byte,bsize)
z.Set(p)
f:=func() *big.Int {
x=x[:0]
x=z.Append(x,b)
z.Add(&z,One)
m:=len(x)
x=x[:2*m-1]
for i:=m;i<len(x);i++ {
x[i]=x[len(x)-i-1]
}
t.SetString(o1string(x),b)
return &t
}
return f
}
func mkpalgen(p *big.Int, b int) palgen {
var t big.Int
fe:=mkepalgen(p,b)
fo:=mkopalgen(p,b)
te:=fe()
to:=fo()
f:=func() *big.Int {
r:=te.Cmp(to)
if r<=0 {
t.Set(te)
te=fe()
}
if r>=0 {
t.Set(to)
to=fo()
}
return &t
}
return f
}
var Bn=[]int{2,10}
type pt struct { t *big.Int; f palgen }
func main(){
Printf("palgen--Generator approach to multi-base palindromes\n"+
"Written 2021 by Eric Olson\n\n")
flag.IntVar(&cmax,"n",19,"Number of palindromes to find.")
flag.Parse()
Pn:=make([]pt,len(Bn))
for n:=0;n<len(Bn);n++ {
Pn[n].f=mkpalgen(One,Bn[n])
Pn[n].t=Pn[n].f()
}
Printf("Finding %d palindomes in bases %d...\n\n",cmax,Bn)
tic()
count:=0
sum:=big.NewInt(0)
for {
nmin:=0
nmax:=0
for n:=1;n<len(Bn);n++ {
if Pn[nmax].t.Cmp(Pn[n].t)<0 { nmax=n }
if Pn[n].t.Cmp(Pn[nmin].t)<0 { nmin=n }
}
if Pn[nmin].t.Cmp(Pn[nmax].t)==0 {
count++
sum.Add(sum,Pn[nmin].t)
Println(Pn[nmin].t)
}
Pn[nmin].t=Pn[nmin].f()
if count>=cmax { break }
}
tsec:=toc()
Printf(
"\nSum of all multi-base palindromes: %d\n"+
"Calculation time %g seconds.\n",sum,tsec)
Exit(0)
}
Code: Select all
$ ./palgen
palgen--Generator approach to multi-base palindromes
Written 2021 by Eric Olson
Finding 19 palindomes in bases [2 10]...
1
3
5
7
9
33
99
313
585
717
7447
9009
15351
32223
39993
53235
53835
73737
585585
Sum of all multi-base palindromes: 872187
Calculation time 0.0054168701171875 seconds.
Don't know about Haskell, but Rust supports an ARMV6-M target.
Code: Select all
Option default integer
Timer =0
Dim b$ length 6
Dim c$ length 6
Dim d$,e$
For i=1 To 999999
b$=Str$(i)
strrev b$,c$
If b$=c$ Then
d$=Bin$(i)
strrev d$,e$
If d$=e$ Then
Print b$,d$
Inc j,i
EndIf
EndIf
Next i
Print Timer/1000 " seconds. Total is ",j
End
CSub strrev
00000000
B085B480 6078AF00 687B6039 60BB781B B2DA68BB 701A683B 60FB2301 68BAE00D
1AD368FB 687A3301 6839441A 440B68FB 701A7812 330168FB 68BB60FB 68FA3301
D3EC429A BF00BF00 46BD3714 4770BC80
End CSub
It looks like the C binary is included as a sequence of hexadecimal values at the end of the program. Would it be possible to post the C source and explain where those numbers came from?matherp wrote: ↑Mon Sep 20, 2021 10:19 amHere is a version for MMBasic on the Pico
Uses the ability of MMBasic to call a C routine to reverse the string. Takes 72.1 seconds with a single CPU clocked at 250MHz. Lots of optimisation could of course be done but it demoes the interpreterCode: Select all
Option default integer Timer =0 Dim b$ length 6 Dim c$ length 6 Dim d$,e$ For i=1 To 999999 b$=Str$(i) strrev b$,c$ If b$=c$ Then d$=Bin$(i) strrev d$,e$ If d$=e$ Then Print b$,d$ Inc j,i EndIf EndIf Next i Print Timer/1000 " seconds. Total is ",j End CSub strrev 00000000 B085B480 6078AF00 687B6039 60BB781B B2DA68BB 701A683B 60FB2301 68BAE00D 1AD368FB 687A3301 6839441A 440B68FB 701A7812 330168FB 68BB60FB 68FA3301 D3EC429A BF00BF00 46BD3714 4770BC80 End CSub
See the first post in this threadIt looks like the C binary is included as a sequence of hexadecimal values at the end of the program. Would it be possible to post the C source and explain where those numbers came from?
Code: Select all
'palindrome numbers are symetrical. in base 10 they should have the form abccba from numbers under 1e6.
'in base 2 they cannot end in a zero, becuase the first bit cannot be zero. This limits to odd numbers.
'in base 10 you will have to ripple through abccba and abcba and abba and aba and aa and a
OPTION DEFAULT INTEGER
Timer = 0
Total = 0
'6 digits
For a=1 To 9 Step 2
k=a*100001
For b=0 To 9
l=b*10010
For c=0 To 9
x=c*1100+l+k
bincheck
Next c,b,a
'5 digits
For a=1 To 9 Step 2
k=a*10001
For b=0 To 9
l=b*1010
For c=0 To 9
x=c*100+l+k
bincheck
Next c,b,a
'4 digits
For a=1 To 9 Step 2
k=a*1001
For b=0 To 9
x=b*110+k
bincheck
Next b,a
'3 digits
For a=1 To 9 Step 2
k=a*101
For b=0 To 9
x=b*10+k
bincheck
Next b,a
'2 digits
For a=1 To 9 Step 2
x=a*11
bincheck
Next a
'1 digits, these are all symetrical (b1,b11,b101,b111,b1001)
For x=1 To 9 Step 2
Total = Total + x
Next x
Print Timer
Print Total
' checking the binary string for symmetry
' the lsb is always 1 (odd), the msb is always 1 (never start with 0)
' so we only check the other digits
' for strings with an odd length we do not test the center bit.
Sub bincheck
y$=Bin$(x) : w=Len(y$) : z=Int(w/2)
'Print y$,z,w
pali=0
For i=2 To z
'Print Mid$(y$,i,1),Mid$(y$,w+1-i,1)
If Mid$(y$,i,1)=Mid$(y$,w+1-i,1) Then
pali=pali+1
End If
Next i
If pali = z-1 Then
Total = Total + x
Print y$," pali"
End If
End Sub
Not ported code but same principle. Generates known odd numbered decimal palindromes, then checks if they are binary palindromes using PIO to reverse the bits - 0.269 seconds at 250 MHz.
Code: Select all
import machine
from rp2 import StateMachine, asm_pio
import time
machine.freq(250_000_000)
@asm_pio()
def ReverseBits():
pull(block)
mov(isr, reverse(osr))
push()
sm = StateMachine(0, ReverseBits)
sm.active(1)
@micropython.native
def Main():
def Test(n):
global sum
sm.put(n)
x = sm.get()
while (x & 1) == 0:
x = x >> 1
if n == x:
sum += n
def f(n, div, mpy):
return ((n // div) % 10) * mpy
for n in range(1, 10, 2):
Test(n) # a
Test(n + f(n, 1, 10)) # aa
Test(n + f(n, 1, 100)) # a0a
Test(n + f(n, 1, 1000)) # a00a
Test(n + f(n, 1, 10000)) # a000a
Test(n + f(n, 1, 100000)) # a0000a
for n in range(11, 100, 2):
Test(n + f(n, 1, 10000) + f(n, 10, 1000)) # ab0ba
Test(n + f(n, 1, 100000) + f(n, 10, 10000)) # ab00ba
for n in range(101, 1000, 2):
if f(n, 100, 1) == f(n, 1, 1):
Test(n) # aba and aaa
if f(n, 100, 1) == f(n, 10, 1):
Test(n + f(n, 1, 1000)) # abba amd aaaa
Test(n + f(n, 1, 10000) + f(n, 10, 1000)) # abcba
Test(n + f(n, 1, 100000) + f(n, 10, 10000) + f(n, 100, 1000)) # abccba
t = time.ticks_ms()
sum = 0
Main()
t = time.ticks_ms() - t
print("Sum is {}".format(sum))
print("Run time is {} seconds @ {} MHz".format(t / 1000, machine.freq() / 1_000_000))
Code: Select all
Sum is 872187
Run time is 0.269 seconds @ 250.0 MHz
Code: Select all
Timer =0
GoTo start
bc:
y$=Bin$(x):w=Len(y$):z=w\2
p=0
For i=2 To z
If Peek(byte q+i)=Peek(byte q+w+1-i) Then
Inc p
EndIf
Next
If p = z-1 Then
Inc T,x
EndIf
Return
start:
Option DEFAULT INTEGER
Dim y$
Dim q=Peek(varaddr y$)
For a=100001 To 900009 Step 200002
For b=0 To 90090 Step 10010
x=b+a
GoSub bc
Inc x,1100
GoSub bc
Inc x,1100
GoSub bc
Inc x,1100
GoSub bc
Inc x,1100
GoSub bc
Inc x,1100
GoSub bc
Inc x,1100
GoSub bc
Inc x,1100
GoSub bc
Inc x,1100
GoSub bc
Inc x,1100
GoSub bc
Next :Next
For a=10001 To 90009 Step 20002
For b=0 To 9090 Step 1010
x=b+a
GoSub bc
Inc x,100
GoSub bc
Inc x,100
GoSub bc
Inc x,100
GoSub bc
Inc x,100
GoSub bc
Inc x,100
GoSub bc
Inc x,100
GoSub bc
Inc x,100
GoSub bc
Inc x,100
GoSub bc
Inc x,100
GoSub bc
Next :Next
For a=1001 To 9009 Step 2002
For b=0 To 990 Step 110
x=b+a
GoSub bc
Next b,a
For a=101 To 909 Step 202
For b=0 To 90 Step 10
x=b+a
GoSub bc
Next :Next
For x=11 To 99 Step 22
GoSub bc
Next
For x=1 To 9 Step 2
Inc T,x
Next
Print Timer
Print T
Code: Select all
def Test(n):
global sum
b = bin(n)[2:]
for x in range(len(b) >> 1):
if b[x] != b[-(x+1)]:
return
sum += n
Code: Select all
Sum is 872187
Run time is 0.768571 seconds @ 250.0 MHz